Wonderful Tools

A collection of thoughtfully crafted tools, designed with care and purpose to enhance your digital experience and solve real problems with elegance.


goETL

Modern ETL for LLM Dataset Preparation

GOETL is a modern, extensible ETL (Extract, Transform, Load) utility for preparing datasets for LLM (Large Language Model) training and analytics. It supports both CLI and REST API modes and includes a React-based web UI for interactive dataset preparation.

Extract

Extract text from PDF and TXT files.

Transform

Clean, tokenize, and chunk text for LLM-friendly datasets.

Load

Output to JSONL, CSV, or directly to databases (Postgres, MySQL, SQLite, MongoDB, Redis)

Semantic Codebase Analysis

Generate semantic graphs from code directories

REST API

Run as a web service for programmatic or UI-driven ETL

Web UI

Intuitive React frontend for easy job configuration and monitoring

Kubernetes & Docker Ready

Production-grade deployment with Caddy reverse proxy
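
To make the Transform and Load steps above concrete, here is a minimal sketch in Python of cleaning text, splitting it into whitespace tokens, grouping those tokens into chunks, and emitting one JSON object per line (JSONL). It is an illustration only and not goETL's actual implementation or API: the function names `clean`, `chunk`, and `to_jsonl`, the `id` and `text` fields, and the whitespace tokenizer are all assumptions made for this example.

```python
import json
import re


def clean(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(tokens: list[str], size: int, overlap: int = 0) -> list[list[str]]:
    """Split tokens into windows of at most `size`, sharing `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window reaches the end, so no redundant tail is emitted.
        if start + size >= len(tokens):
            break
    return chunks


def to_jsonl(text: str, size: int, overlap: int = 0) -> str:
    """Clean, tokenize on whitespace, chunk, and serialize as JSONL."""
    tokens = clean(text).split()  # hypothetical tokenizer: plain whitespace
    lines = [
        json.dumps({"id": i, "text": " ".join(window)})
        for i, window in enumerate(chunk(tokens, size, overlap))
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_jsonl("  Hello   world,\nthis is sample   text ", size=3))
```

A real LLM pipeline would usually swap the whitespace split for a model-specific tokenizer so that chunk sizes match the model's token budget. The overlap parameter is a common way to keep context across chunk boundaries.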


OpenData

An award-winning open data transformation and processing toolkit and pipeline endpoint

Open-T-DATA is an open-source initiative focused on streamlining the extraction, transformation, and utilization of open datasets. It provides tools to simplify working with large-scale public datasets by offering efficient data processing pipelines and user-friendly APIs for developers and data enthusiasts.

Seamless Data Extraction:

Automated ingestion from public APIs and open data sources.

Data Transformation Pipelines:

Process and structure raw data into easy-to-use formats (JSON, CSV, etc.).

Open APIs:

Access transformed datasets through an intuitive API interface.

Extensible Architecture:

Build custom data workflows tailored to your needs.

Community-Driven:

Designed for collaboration, with contributions and feedback encouraged.
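
As a rough illustration of the transformation step described above, the following Python sketch normalizes loosely structured records and writes them out as CSV. It is not the project's actual pipeline or API: the function names `normalize` and `to_csv`, the key-naming rule, and the sample records are invented for this example.

```python
import csv
import io


def normalize(record: dict) -> dict:
    """Snake-case the keys, strip string values, and drop empty fields."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value is None or value == "":
            continue
        out[key.strip().lower().replace(" ", "_")] = value
    return out


def to_csv(records: list[dict]) -> str:
    """Normalize records and render them as CSV with one shared header."""
    rows = [normalize(r) for r in records]
    # Union of keys in first-seen order, so every row fits the same header.
    fields = list(dict.fromkeys(k for row in rows for k in row))
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample records standing in for a raw open-data response.
    raw = [
        {"City ": " Oslo ", "Population": 709000},
        {"city": "Bergen", "population": None, "Country Code": "NO"},
    ]
    print(to_csv(raw))
```

Keeping normalization separate from serialization means the same cleaned rows could also be written as JSON without changing the cleaning rules.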
