Live Open Source Explorer
Explore live open-source projects and AI models.
Search public open-source repositories from GitHub and AI models from Hugging Face. Every page shows 10 results with clean pagination.
Live Search
Search live open-source data
Search GitHub repositories and Hugging Face models directly, then explore stars, downloads, source links, and project details.
Live Results
GitHub Open Source Repositories
Search: language-data
Page 3
Showing 10 results of 1,705
imoneoi/openchat
GitHub Python Apache License 2.0
OpenChat: Advancing Open-source Language Models with Imperfect Data
umpirsky/country-list
GitHub HTML MIT License
🌐 List of all countries with names and ISO 3166-1 codes in all languages and data formats.
togethercomputer/RedPajama-Data
GitHub Python Apache License 2.0
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
lk-geimfari/mimesis
GitHub Python MIT License
Mimesis is a fast Python library for generating fake data in multiple languages.
SPLWare/esProc
GitHub Java Apache License 2.0
esProc SPL is a JVM-based programming language designed for structured data computation, serving as both a data analysis tool and an embedded computing engine.
kaitai-io/kaitai_struct
GitHub Shell
Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby / Rust
yizhongw/self-instruct
GitHub Python Apache License 2.0
Aligning pretrained language models with instruction data generated by themselves.
apache/fory
GitHub Java Apache License 2.0
A blazingly fast multi-language serialization framework for idiomatic domain objects, schema IDL, and cross-language data exchange.
NVlabs/VILA
GitHub Python Apache License 2.0
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
maziyarpanahi/openmed
GitHub Python Apache License 2.0
Local-first healthcare AI: clinical NER & HIPAA PII de-identification that runs 100% on-device. 1,000+ medical models, 12 languages, Apple MLX + Python, no cloud, no patient data leaving your network. Apache-2.0
10 results on this page · 1,705 total found
Showing first 1,000 accessible GitHub results.
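The paging figures above (10 results per page, 1,705 found, only the first 1,000 accessible) can be worked through in a short sketch. It assumes the listing is backed by GitHub's public repository search endpoint, which this page does not confirm; the helper names `page_count`, `result_range`, and `search_url` are hypothetical and exist only for illustration.

```python
from math import ceil
from urllib.parse import urlencode

# Assumption: GitHub's public REST search endpoint is the data source.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"
# Cap taken from this page: "Showing first 1,000 accessible GitHub results."
MAX_ACCESSIBLE = 1000


def page_count(total_found: int, per_page: int = 10, cap: int = MAX_ACCESSIBLE) -> int:
    """Number of pages that can actually be reached, given the result cap."""
    return ceil(min(total_found, cap) / per_page)


def result_range(page: int, per_page: int = 10) -> tuple[int, int]:
    """1-based positions of the first and last result on a given page."""
    first = (page - 1) * per_page + 1
    return first, first + per_page - 1


def search_url(query: str, page: int, per_page: int = 10) -> str:
    """Build a search URL for one page of results (hypothetical helper)."""
    params = {"q": query, "per_page": per_page, "page": page}
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(page_count(1705))               # reachable pages for this search
    print(result_range(3))                # positions shown on page 3
    print(search_url("language-data", 3))
```

Under these assumptions, 1,705 matches yield 100 reachable pages rather than 171, and page 3 covers results 21 through 30.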