Data
Evals
| Category | Tool | Remarks |
|---|---|---|
| Libraries | athina-evals, deepeval, geval, openevals, promptfoo, inspect-ai | |
| | evaluate, torchmetrics, torcheval | Off-the-shelf metrics |
| | mir-eval | |
| Agents | agentevals | |
| Benchmarks | openai-evals, yet-another-applied-llm-benchmark, lm-evaluation-harness, openbench | |
| | llmtest-needleinahaystack | Needle in a Haystack |
| | agentbench, textarena | Agents |
| | open-llm-leaderboard, LMArena | Arena |
| Diversity | diversity | |
| Hallucination | lettucedetect, hallucination-leaderboard, selfcheckgpt | |
| RAG | ragas, ragchecker | |
| | auto-evaluator | Generate synthetic QA pairs from docs |
| | chunking-evaluation | Evaluate chunking strategies |
| NER/POS Tagging | seqeval | NER, POS tagging |
| Information Retrieval | ranking-metrics, cute_ranking | |
| | mir_eval | Music IR |
| Text Generation | bleurt, bertscore, moverscore | |
| Deep Research | search_evals | |
| Recommendation System | rs_metrics | |
Development Workflow
| Category | Tool | Remarks |
|---|---|---|
| Experiment Management | trulens, mlflow, wandb, mlop | |
| Coding Agents | claude-code, codex, gemini-cli, qwen-code, mistral-vibe, amp, opencode, openhands | |
| | vibe-kanban, conductor | Orchestration |
| | ruler, agent-rules-sync | Rules |
| | superpowers, awesome-copilot, upskill | Skills |
| | continuous-claude, claude-mem | Context Management |
| | beads | Shared Memory |
| | claudecode-telegram, takopi | Telegram bridge |
| | agent-trace | Code annotation |
| Voice Typing | vscode-speech | |
| Frontend Feedback | agentation | |
| Linting | ruff, pylint, pycodestyle, black, pydocstyle | |
| | safety, bandit, shellcheck | Vulnerabilities |
| | mypy | Type-checking |
| CLI Formatting | rich, tabulate, prompt-toolkit | |
| Debugging | PySnooper | |
| Dependency Management | uv | |
| | pip-chill | pip freeze without dependencies |
| | pipreqs | Reverse engineer requirements.txt from imports |
| | conda-pack | Export conda for offline use |
| Documentation | mkdocs, pdoc | |
| Progress bar | fastprogress, tqdm | |
| Testing | xdoctest | Improved doctest |
| | crosshair | Find failure cases for functions |
Context Engineering
| Category | Tool | Remarks |
|---|---|---|
| Automatic Prompt Engineering | dspy, textgrad, adalflow, zenbase, gepa | |
| | dspydantic | Pydantic models |
| | openevolve | Evolutionary code optimization |
| System Prompts | llm-system-prompts, leaked-system-prompts, system-prompt-leaks, awesome-ai-system-prompts, cl4r1t4s, system-prompts-and-models-of-ai-tools | |
| Prompt Compression | toon, llm-lingua | |
| Function Calling | functionary | |
| Structured Output | guidance, instructor, jsonformer, lm-format-enforcer, outlines, xgrammar, lmql, fructose | |
| | json-repair | Post-process broken JSON |
| | llm-scraper | Webpage to JSON |
| Memory | mem0, letta, memobase, memary, langmem, memoripy | |
| | cognee | Memory using knowledge graphs |
| Code Interpreter | gpt-code-interpreter, open-interpreter, codeinterpreter-api | |
| Steering Vectors | dialz, repeng | |
RAG
Agents
| Category | Tool | Remarks |
|---|---|---|
| Libraries | autogen, crewai, langroid, openai-agents, pydantic-ai, marvin, metagpt, semantic-kernel | |
| | agent-sdk, claude-agent-sdk, copilot-sdk | SDKs |
| | langgraph | Graphs |
| | smolagents | Code-based agent |
| MCP | fastmcp, enrichmcp | |
| | mcp-scan | Scan for security vulnerabilities |
| Web Use | browser-use, agent-browser | |
| | magnitude | Automated Testing |
| Deep Research | deer-flow, deeptutor, deepagents | |
| Computer Use | ui-tars-desktop, cuda | |
| Sandbox | microsandbox, e2b, screenenv, sandbox-runtime, yolobox, cco, daytona, docker-sandbox | |
| Web Search API | sonar | |
Finetuning
| Category | Tool | Remarks |
|---|---|---|
| LLM Finetuning | axolotl, unsloth, torchtune, peft, litgpt, llama-factory | |
| | onebitllms, matmulfreellm | 1.58-bit LLMs |
| Model Merging | mergekit, mergoo, mergenetic | |
| Multi-modal VLMs | maestro, nanovlm, mlx-vlm | |
| | smol-vision | Recipes |
| Distributed Training | megatron-lm, deepspeed, yafsdp, nanotron, fairscale, colossalai | |
| | hivemind, psyche | Decentralized Training |
| Tokenization | supertokenizer | Train multi-word BPE |
| | tokendagger | Faster tiktoken |
| Classification | adaptive-classifier | Continuous Learning |
| Abliteration | heretic | |
Post-training
| Category | Tool | Remarks |
|---|---|---|
| Libraries | art, trl, verl, openrlhf, atropos, retrain, nemo-rl, slime, skyrl | |
| Instruction Tuning | llava | text+image |
| | airoboros | self-instruct |
| Verifiers | verifiers | |
| Environments | harbor, openspiel, openenv, gym, prime-environments, llmgym, reasoning-gym, gem, agentgym | |
| MCTS | search-and-learn | |
Voice
| Category | Tool | Remarks |
|---|---|---|
| Libraries | livekit | |
| Pre-processing | moviepy | |
| | denoiser | Denoising |
| Voice Activity Detection | silero-vad, smart-turn, yamnet | |
| General models | seamless-communication | |
| Source Separation | samaudio, spleeter, nussl, open-unmix-pytorch, asteroid | |
| | indicconformerasr, vistaar | Indic Languages |
| Speech Recognition | faster-whisper, whisperx, vibevoice-asr, kaldi, speech_recognition, delta, pocketsphinx-python, deepspeech, stt, vosk | |
| Speaker Identification | whisperkitlive, resemblyzer | |
| Speech Synthesis | chatterbox, festvox, cmuflite, tts | |
| | whisper-streaming | Realtime |
| | pocket-tts | CPU |
Vision
| Category | Tool | Remarks |
|---|---|---|
| Vision Language Models | CogVLM | |
| Watermarking | meta-seal | |
| Facial recognition | deepface, face_recognition, mtcnn, insightface, face-detection, terran | |
| | face-alignment | Find facial landmarks |
| | Facial-Expression-Recognition.Pytorch | Face Emotion |
| Face swapping | faceit, faceit-live, avatarify | |
| GANs | mimicry, imaginaire, pytorch-lightning-gans | |
| Image Processing | scikit-image, imutils, opencv-wrapper, opencv-python | |
| | torchio | Medical Images |
| Object detection | luminoth, detectron2, mmdetection, icevision | |
| OCR | keras-ocr, pytesseract, keras-craft, ocropy, doc2text | |
| | easyocr, kraken, PaddleOCR | Multilingual OCR |
| | layout-parser, pdftabextract | Table Extraction |
| Segmentation | segmentation_models.pytorch | Segmentation models in PyTorch |
| Pretrained models | pretrained-models.pytorch, pytorchcv, pytorch-image-models | Pre-trained ConvNets |
Inference
Monitoring
| Category | Tool | Remarks |
|---|---|---|
| Observability | openllmetry, phoenix, logfire, context-viewer | |
| | gpumonitor, nvtop, jupyterlab-nvdashboard | GPU usage |
| Alerts | knockknock, jupyter-notify, apprise, pynotifier | Notifications |
| Guardrails | openai-guardrails, giskard, langkit, garak, deepchecks, nemo-guardrails | |
| | rebuff | Prompt Injection Detection |
| | curse-words, badwords, LDNOOBW, profanity | Profanity |
| | uqlm | Uncertainty Quantification |
| Logging | loguru | |
| Drift Detection | alibi-detect, torchdrift, boxkite | Outlier and drift detection |
| | ft-drift | Detect drift in OpenAI messages |
| Edge Deployment | TensorFlow Lite, coreml, TensorFlow.js | |
| AI Detection | binoculars | |
| Testing | schemathesis | Automatic test generation from Swagger |
| | mktestdocs, exdown | Test code present in markdown files |
| Benchmarking | pytest-benchmark | Profile time in pytest |
| | torchprof | Profile PyTorch layers |
| | scalene, pyinstrument | Profile Python code |
| | k6 | Load test API |
| | ai-benchmark | Benchmark VM on 19 different models |
Classic ML
| Category | Tool | Remarks |
|---|---|---|
| AutoML | auto-sklearn, mljar-supervised, automl-gs, pycaret, evalml | |
| | lazypredict | Run all sklearn models at once |
| | tpot | genetic |
| | autocat | text-classification |
| | mindsdb, ludwig | Autogenerate ML code |
| Active Learning | modal | |
| Anomaly detection | adtk | |
| Contrastive Learning | contrastive-learner | |
| Gradient Boosting | catboost, xgboost, ngboost, lightgbm, thundergbm | |
| Graph Neural Networks | spektral | GNN for Keras |
| Graph Embedding and Community Detection | karateclub, python-louvain, communities | |
| Hidden Markov Models | hmmlearn | |
| Interpretable Models | imodels | Models that show rules |
| Multi-view Learning | mvlearn | |
| Noisy Label Learning | cleanlab | |
| Optimization | nevergrad | Gradient Free Optimization |
| | cvxpy | Convex Optimization |
| Optimal Transport | pot, geomloss | |
| Probabilistic modeling | pomegranate, pymc3 | |
| Rule based classifier | sklearn-expertsys | |
| Self-Supervised Learning | lightly, vissl, solo-learn | Implementations of SSL models |
| | self_supervised | Self-supervised models in Fast.AI |
| Spiking Neural Network | norse | |
| Support Vector Machines | thundersvm | Run SVM on GPU |
| Survival Analysis | lifelines | |
| Feature engineering | featuretools, autopandas | |
| | tsfresh, python-holidays, skits, catch22 | Time series |
| Dimensionality reduction | fbpca, fitsne, trimap | |
| Data Cleaning | imblearn | Class Imbalance |
| | category_encoders, dirty_cat | Categorical encoding |
| | missingno | Missing values |
| Hyperparameter tuning | hyperopt, optuna, evol, talos | Libraries |
| | keras-tuner | Keras |
| | hyperopt-sklearn, scikit-optimize | Bayesian Optimization |
| | sklearn-deap, sklearn-genetic-opt | Evolutionary algorithm |
| Adversarial Attack | cleverhans | General |
| | foolbox | Image |
| | triggers | NLP |
| Interpretability | eli5, lime, shap, alibi, tf-explain, treeinterpreter, pybreakdown, xai, lofo-importance, interpretML, shapash | |
| | Language Interpretability Tool, transformers-interpret | transformers |
| | exbert, bertviz | BERT |
| | word2viz, whatlies | word-vectors |
| Tabular Data | tabpfn | |
| Time series | prophet, tslearn, pyts, seglearn, cesium, stumpy, darts, gluon-ts, stldecompose, sktime | |
| | atspy | Automated time-series models |
| | orion, luminaire | Anomaly detection |
| | pmdarima | ARIMA models |
| Recommendation System | apyori | Apriori algorithm |
| | implicit | Collaborative Filtering |
| | xlearn, DeepCTR, RankFM | Factorization machines (FM) and field-aware factorization machines (FFM) |
| | libmf-python | Matrix Factorization |
| | lightfm, spotlight | Popular Recsys algos |
| | CaseRecommender | PyTorch |
| | surprise | scikit-learn like API |
| PyTorch | pytorch-summary | Keras-like summary |
| | torchtyping, tsalib | Type annotation for tensors |
| | einops | Einstein Notation |
| | kornia | Computer Vision Methods |
| | nonechucks | Drop corrupt data automatically in DataLoader |
| | pytorch-optimizer | Collection of optimizers |
| | pytorch-block-sparse | Sparse matrix replacement for nn.Linear |
| | pytorch-forecasting | Time series forecasting in PyTorch Lightning |
| | pytorch-lightning | Lightweight wrapper for PyTorch |
| | skorch | Wrap PyTorch in scikit-learn compatible API |
| | torchcontrib | SOTA Building Blocks in PyTorch |
| | bitsandbytes | 8-bit optimizers for PyTorch |
| Scikit-learn | scikit-lego, iterative-stratification | |
| | iterstrat | Cross-validation for multi-label data |
| | scikit-multilearn | Multi-label classification |
| | tscv | Time-series cross-validation |
| | sparseml | Sparsification |
| Helpers | mlxtend | Extra utilities not present in frameworks |
| Visualization | matplotlib, seaborn, pygal, plotly, plotnine | |
| | yellowbrick, scikit-plot | libraries |
| | pyldavis | Topic models |
| | dtreeviz | Decision trees |
| | txtmarker | Highlight text in PDF |
| | metriculous | Visualize model performance |
| | mermaid | markdown |
| | squarify | Tree-map chart |
| | babyplots | 3D charts |
| | dl-visuals, ml-visuals, chalk | Diagrams |
| | bar_chart_race | bar chart race |
| | pandas_alive | Animated charts in pandas |
| | umap, ivis | high-dimensions |
| | bokeh, flourish-studio, mpld3 | Interactive charts |
| | netron, nn-svg | Model visualization |
| | tensor-sensor | Visualize tensors |
| | keract | Activation maps for Keras |
| | keras-vis | Visualize Keras models |
| | PlotNeuralNet | LaTeX code for drawing neural networks |
| | loss-landscape-anim | Generate loss landscape of optimizer |
| | open-color | Color Schemes |
| | mplcyberpunk | Cyberpunk style |
| | chart.xkcd | XKCD style |
| | adjustText | Prevent overlap when plotting point text labels |
NLP
| Category | Tool | Remarks |
|---|---|---|
| Libraries | spacy, nltk, corenlp, deeppavlov, kashgari, transformers, ernie, stanza, nlp-architect, spark-nlp, pytext, FARM | |
| | headliner, txt2txt | seq2seq models |
| | Nvidia NeMo | Toolkit for ASR, NLP and TTS |
| | nlu | 1-line models for NLP |
| | pyconverse | Conversational Text Analysis |
| | booknlp | NLP for Books |
| | finetune | scikit-learn style |
| | compromise | Javascript NLP |
| CPU-optimizations | turbo_transformers, onnx_transformers, fastT5 | |
| Preprocessing | textacy, texthero, textpipe, nlpretext | |
| | JamSpell, pyhunspell, pyspellchecker, cython_hunspell, hunspell-dictionaries, autocorrect (can add more languages), symspellpy, spello (train your own spelling correction), contextualSpellCheck, neuspell, nlprule, spylls | Spelling Correction |
| | language-tool-python, gingerit, gramformer | Grammatical Error Correction |
| | ekphrasis | Pre-processing for social media texts |
| | editop | Compute edit-operations for text normalization |
| | contractions, pycontractions | Contraction Mapping |
| | truecase | Fix casing |
| | nnsplit, deepsegment, sentence-doctor, pysbd, sentence-splitter | Sentence Segmentation |
| | wordninja | Probabilistic Word Segmentation |
| | punctuator2 | Punctuation Restoration |
| | stopwords-iso | Stopwords for all languages |
| | language-check, langdetect, polyglot, pycld2, cld2, cld3, langid, lumi_language_id | Language Identification |
| | langcodes | Get language from language code |
| | neuralcoref | Coreference Resolution |
| | inflect, lemminflect, pyinflect | Inflections |
| | scrubadub | PII removal |
| | ftfy, clean-text, text-unidecode | Fix Unicode Issues |
| | fastpunct | Punctuation Restoration |
| | pyphen | Hyphenate words into syllables |
| | pypostal, mordecai, usaddress, libpostal | Parse Street Addresses |
| | geopy, geocoder, nominatim, pelias, photon, lieu | Geocoding |
| | probablepeople, python-nameparser | Parse person name |
| | python-phonenumbers | Parse phone numbers |
| | numerizer, word2number | Parse natural language number |
| | dateparser | Parse natural dates |
| | ctparse | Parse natural language time |
| | daterangeparser | Parse date ranges in natural language |
| | emoji | Handle emoji |
| | pyarabic | multilingual |
| Tokenization | sentencepiece, youtokentome, subword-nmt | |
| | sacremoses | Rule-based |
| | jieba, pkuseg | Chinese Word Segmentation |
| | kytea | Japanese word segmentation |
| Clustering | kmodes, star-clustering, genieclust | |
| | spherecluster | K-means with cosine distance |
| | sib | Sequential Information Bottleneck |
| | kneed | Automatically find number of clusters from elbow curve |
| | OptimalCluster | Automatically find optimal number of clusters |
| | gsdmm | Short-text clustering |
| Code Switching | codeswitch | |
| Constituency Parsing | benepar, allennlp, chunk-english-fast | |
| Compact Models | mobilebert, distilbert, tinybert, BERT-of-Theseus-MNLI, MiniLM | |
| Cross-lingual Embeddings | muse, laserembeddings, xlm, LaBSE | |
| | transvec, vecmap | Train mapping between monolingual embeddings |
| | MuRIL | Embeddings for 17 Indic languages with transliteration |
| | BPEmb | Subword Embeddings in 275 Languages |
| | piecelearn | Train own sub-word embeddings |
| Dictionary | vocabulary | |
| Domain-specific | codebert | Code |
| | clinicalbert-mimicnotes, clinicalbert-discharge-summary | Clinical Domain |
| | twitter-roberta-base | |
| | scispacy | bio-medical data |
| | blackstone | Legal text |
| Entity Linking | dbpedia-spotlight, GENRE | |
| Entity Matching | py_entitymatching, deepmatcher | |
| Embeddings | InferSent, embedding-as-service, bert-as-service, sent2vec, sense2vec, glove-python, fse | |
| | counterix | Train custom Count-based DSM |
| | embeddix | Convert word vectors format |
| | wiki2vec | Word2Vec trained on DBPedia Entities |
| | chars2vec | Character embeddings for handling typos and slang |
| | rank_bm25, BM25Transformer | BM25 |
| | sentence-transformers, DeCLUTR | BERT sentence embeddings |
| | conceptnet-numberbatch | Word embeddings trained with common-sense knowledge graph |
| | word2vec-twitter | Word2vec trained on Twitter |
| | pymagnitude | Access word-embeddings programmatically |
| | chakin | Download pre-trained word vectors |
| | zeugma | Pretrained-word embeddings as scikit-learn transformers |
| | starspace | Learn embeddings for anything |
| | svd2vec | Learn embeddings from co-occurrence |
| | all-but-the-top | Post-processing for word vectors |
| | entity-embed | Train custom embeddings for named entities |
| Emotion Classification | goemotion-pytorch, text2emotion | |
| | emosent-py | Sentiment scores for Emojis |
| Feature Generation | homer, textstat | Readability scores |
| | LexicalRichness | Lexical Richness Measure |
| Finite State Transducer | OpenFST | |
| Gibberish Detection | nostril, gibberish-detector | |
| Grammar Induction | gitta, grasp | Generate CFG from sentences |
| Information Extraction | claucy | |
| | GiveMe5W1H | Extract 5W1H (who, what, when, where, why, how) phrases from news |
| | spikex | Spacy pipeline for knowledge extraction |
| Keyword extraction | rake, multi-rake, pke, phrasemachine, keybert, word2phrase | |
| | pyate | Automated Term Extraction |
| Knowledge | conceptnet-lite | |
| | stanford-openie | Knowledge Graphs |
| | verbnet-parser | VerbNet parser |
| Knowledge Distillation | textbrewer, aquvitae | |
| Language Model Scoring | lm-scorer, bertscore, kenlm, spacy_kenlm, mlm-scoring | |
| Lexical Simplification | easee | Evaluation metric |
| Morphology | unimorph | Morphology data for many languages |
| Multilingual support | polyglot, trankit | |
| | inltk, indic_nlp | Indic Languages |
| | cltk | Latin / Classical languages |
| | langrank | Auto-select optimal transfer language |
| Named Entity Recognition (NER) | spaCy, Stanford NER, sklearn-crfsuite | |
| | med7 | Medical records |
| Nearest neighbor | faiss, sparse_dot_topn, n2, autofaiss | |
| NLU | snips-nlu | |
| | ParlAI | Dialogue System |
| Paraphrasing | parrot | |
| | pegasus | Question Paraphrasing |
| | paraphrase_diversity_ranker | Rank paraphrases of sentence |
| | sentaugment | Paraphrase mining |
| Phonetics | epitran | Transliterate text into IPA |
| | allosaurus | Recognize phones for 2000 languages |
| Phonology | panphon | Generate phonological feature representations |
| | phoible | Database of segment inventories for 2186 languages |
| Probabilistic parsing | parserator | Create domain-specific parser for address, name etc. |
| Profanity detection | profanity-check | |
| Pronunciation | pronouncing | |
| Question Answering | haystack | Build end-to-end QA system |
| | mcQA | Multiple Choice Question Answering |
| | TAPAS | Table Question Answering |
| Relation Extraction | OpenNRE | |
| Search | elasticsearch-dsl, meilisearch-python, jina | Wrapper for Elasticsearch |
| Semantic parsing | quepy | |
| Sentiment | vaderSentiment, afinn | Rule based |
| | absa | Aspect Based Sentiment Analysis |
| Spacy Extensions | spacy-pattern-builder | Generate dependency matcher patterns automatically |
| | spacy_grammar | Rule-based grammar error detection |
| | role-pattern-builder | Pattern based SRL |
| | textpipeliner | Extract RDF triples |
| | tenseflow | Convert tense of sentence |
| | camphr | Wrapper to transformers, elmo, udify |
| | spleno | Domain-specific lemmatization |
| | spacy-udpipe | Use UDPipe from Spacy |
| | spacymoji | Add emoji metadata to spacy docs |
| String match | phrase-seeker, textsearch | |
| | jellyfish, fuzzy, doublemetaphone | Perform string and phonetic comparison |
| | clavier | Edit distance based on keyboard layout |
| | flashtext | Super-fast extract and replace keywords |
| | pythonverbalexpressions | Verbally describe regex |
| | commonregex | Ready-made regex for email/phone etc. |
| | textdistance, editdistance, word-mover-distance, edlib | Text distances |
| | wmd-relax | Word mover distance for spacy |
| | fuzzywuzzy, spaczz, PolyFuzz, rapidfuzz, fuzzymatcher | Fuzzy Search |
| | deduplipy, dedupe | Active-Learning based fuzzy matching |
| | recordlinkage | Record Linkage |
| Summarization | textrank, pytldr, bert-extractive-summarizer, sumy, fast-pagerank, sumeval | |
| | doc2query | Summarize document with queries |
| | summarizers | Controllable summarization |
| | insight_extractor | Extract insightful sentences from docs |
| Text Extraction | textract (Image, Audio, PDF) | |
| Text Generation | gpt2client, textgenrnn, gpt-2-simple, aitextgen | |
| | markovify | Markov chain |
| | accelerated-text | Template-based generation |
| | keytotext | Keyword to Sentence Generation |
| Transliteration | wiktra | |
| Machine Translation | MarianMT, Opus-MT, joeynmt, OpenNMT, EasyNMT, argos-translate, dl-translate | |
| | googletrans, word2word, translate-python, deep_translator | Translation libraries |
| | mosesdecoder | Statistical MT |
| | apertium | RBMT |
| | translators | Free calls to multiple translation APIs |
| | giza++, fastalign, simalign, eflomal, awesome-align | Word Alignment |
| Thesaurus | python-datamuse | |
| Toxicity Detection | detoxify | |
| Topic Modeling | gensim, guidedlda, enstop, top2vec, contextualized-topic-models, corex_topic, lda2vec, bertopic, tomotopy, ToModAPI | |
| | zeroshot_topics | Zero-shot topic modeling |
| | octis | Evaluate topic models |
| Typology | lang2vec | Compare typological features of languages |
| Visualization | stylecloud | Word Clouds |
| | scattertext | Compare word usage across segments |
| | picture-text | Interactive tree-maps for hierarchical clustering |
| | ipymarkup | Visualize NER and syntax |
| Verb Conjugation | nodebox_linguistics_extended, mlconjug3 | |
| Word Sense Disambiguation | pywsd, ewiser, supwsd | |
| | frame-english-fast | Verb Disambiguation |
| Zero Shot Learning | setfit | |
Misc
| Category | Tool | Remarks |
|---|---|---|
| Automation | pyuserinput, pyautogui, pynput | Control mouse and keyboard |
| Code to Maths | latexify-py, handcalcs | |