
Awesome Linguistics


A curated list of anything remotely related to linguistics, sorted in alphabetical order.


Libraries, frameworks and applications useful for developing applications.

Platforms and toolkits

  • Haxe-linguistics - Early-stage linguistic analysis and natural language processing library for Haxe.
  • Natural - General natural language tools for Node.js.
  • Natural Language Toolkit (NLTK) - A leading platform for building Python programs to work with human language data.
  • Snowball - A language in which stemming algorithms can be easily represented.
  • spaCy - Industrial-strength Natural Language Processing in Python.
  • UralicNLP - An open source Python library for processing morphologically rich and, for the most part, endangered Uralic languages. It can do morphological analysis, generation, lemmatization, disambiguation and lexical lookup for a great many Uralic languages.


Data sets


  • How To Label Data - Guide on managing large scale linguistic annotation projects.
  • Low Resource Languages - A list of resources for conservation, development, and documentation of low resource (human) languages.

On Wikipedia

On YouTube


Some of the more interesting and complete books.


Non-free