🌍 Live Open Source Explorer

Explore live open-source projects and AI models.

Search public open-source repositories from GitHub and AI models from Hugging Face. Every page shows 10 results with clean pagination.

🔎 Live Search

Search live open-source data

Search GitHub repositories and Hugging Face models directly, then explore stars, downloads, source links and project details.

Try keywords like automation, CRM, analytics, chatbot, llama or workflow.

Choose where to search live data.

Live Results

GitHub Open Source Repositories

Search: language-data

Page 6

Showing 10 results from 1,705

byzer-org/byzer-lang

GitHub Scala Apache License 2.0

Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.

★ 1,837 Forks 539 byzer-org Updated 14 Jun 2026

yobix-ai/extractous

GitHub Rust Apache License 2.0

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

★ 1,758 Forks 96 yobix-ai Updated 25 Jun 2026

WongSaang/chatgpt-ui

GitHub Vue MIT License

A ChatGPT web client that supports multiple users, multiple languages, and multiple database connections for persistent data storage. Provides Docker images and quick deployment scripts.

★ 1,629 Forks 350 WongSaang Updated 07 Jun 2026

Delta-ML/delta

GitHub Python Apache License 2.0

DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/

★ 1,607 Forks 283 Delta-ML Updated 18 Jun 2026

stanford-oval/WikiChat

GitHub Python Apache License 2.0

WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.

★ 1,604 Forks 146 stanford-oval Updated 18 Jun 2026

crownpku/Rasa_NLU_Chi

GitHub Python Apache License 2.0

Turn Chinese natural language into structured data (Chinese natural language understanding)

★ 1,533 Forks 420 crownpku Updated 05 Jun 2026

emcf/thepipe

GitHub Python MIT License

Get clean data from tricky documents, powered by vision-language models ⚡

★ 1,525 Forks 99 emcf Updated 19 Jun 2026

code-kern-ai/refinery

GitHub Python Apache License 2.0

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.

★ 1,470 Forks 74 code-kern-ai Updated 14 Jun 2026

tinyfish-io/agentql

GitHub Python MIT License

AgentQL is a suite of tools for connecting your AI to the web. Featuring a query language and Playwright integrations for interacting with elements and extracting data quickly, precisely, and at scale. Includes REST API, Python and JavaScript SDKs, browser debugger.

★ 1,407 Forks 161 tinyfish-io Updated 24 Jun 2026

aws-cloudformation/cloudformation-guard

GitHub Rust Apache License 2.0

Guard offers a policy-as-code domain-specific language (DSL) to write rules and validate JSON- and YAML-formatted data such as CloudFormation Templates, K8s configurations, and Terraform JSON plans/configurations against those rules. Take this survey to provide feedback about cfn-guard: https://a...

★ 1,383 Forks 196 aws-cloudformation Updated 22 Jun 2026
Pagination Page 6 of 100

10 results on this page · 1,705 total found

Showing first 1,000 accessible GitHub results.
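
The page counts above fit together as follows: 1,705 results were found, only the first 1,000 are accessible, and each page shows 10, which gives 100 pages. The sketch below reproduces that arithmetic. It is only an illustration under those stated numbers. The function and constant names are hypothetical and are not taken from the site's own code, which is not shown here.

```python
import math

# Values taken from the page text; the names themselves are hypothetical.
RESULTS_PER_PAGE = 10     # "Every page shows 10 results"
ACCESSIBLE_CAP = 1000     # "Showing first 1,000 accessible GitHub results"


def page_count(total_found, per_page=RESULTS_PER_PAGE, cap=ACCESSIBLE_CAP):
    """Number of pages that can be browsed, given the accessibility cap."""
    accessible = min(total_found, cap)
    return math.ceil(accessible / per_page)


def page_bounds(page, total_found, per_page=RESULTS_PER_PAGE, cap=ACCESSIBLE_CAP):
    """Return the 1-based (first, last) result ranks shown on a given page."""
    pages = page_count(total_found, per_page, cap)
    if not 1 <= page <= pages:
        raise ValueError(f"page must be between 1 and {pages}")
    first = (page - 1) * per_page + 1
    last = min(page * per_page, min(total_found, cap))
    return first, last


if __name__ == "__main__":
    print(page_count(1705))      # 100
    print(page_bounds(6, 1705))  # (51, 60)
```

Under these assumptions, page 6 covers results 51 to 60, and results 1,001 to 1,705 are counted in the total but cannot be reached through pagination.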