If you're looking for a slightly unconventional entry point, I recommend the seminal text 'Elements of Information Theory' by Cover and Thomas (skipping chapters like Network Information Theory and the Gaussian Channel should be fine), paired with David MacKay's 'Information Theory, Inference, and Learning Algorithms'. Both seem to be available online:
http://www.cs-114.org/wp-content/uploads/2015/01/Elements_of_Information_Theory_Elements.pdf
http://www.inference.org.uk/itprnn/book.pdf
They cover the fundamentals of what optimal inference looks like, why current methods work, etc. (in a very abstract way via Kolmogorov complexity and its theorems, and in a more concrete way in MacKay's text). Another good theoretical companion, though a little more applied, is the 'Learning from Data' course (also available for free):
https://work.caltech.edu/telecourse.html
Excellent lecturer and material (for a glimpse, take lecture 6: 'Theory of Generalization -- how an infinite model can learn from a finite sample').
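That lecture builds up to the VC generalization bound. Here is a minimal sketch of evaluating it numerically, assuming a polynomial growth function m_H(N) <= N^d_vc + 1 (which holds whenever the VC dimension d_vc is finite); the helper is my own illustration, not course code:

    import math

    def vc_bound(n, d_vc, delta=0.05):
        # Upper bound on E_out - E_in that holds with probability >= 1 - delta,
        # using the VC bound with growth function m_H(N) <= N^d_vc + 1.
        growth = (2 * n) ** d_vc + 1                     # m_H(2N)
        return math.sqrt(8.0 / n * math.log(4.0 * growth / delta))

    # The bound shrinks as the sample grows, even though the hypothesis
    # set is infinite -- the point of the lecture.
    for n in (100, 1000, 10000, 100000):
        print(n, round(vc_bound(n, d_vc=3), 3))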
Afterward I would move to modern developments (deep learning, or whatever interests you), but you'll be well equipped.
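As a tiny taste of the material: Shannon entropy, the quantity both books keep returning to, is the lower bound on how compactly a source can be coded. A self-contained sketch (nothing book-specific):

    import math
    from collections import Counter

    def entropy_bits(text):
        # Empirical Shannon entropy in bits per symbol: the lower bound on the
        # average code length of any symbol-by-symbol code for this source.
        counts = Counter(text)
        total = len(text)
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    print(entropy_bits("aaaaaaab"))  # ~0.54 bits: highly predictable
    print(entropy_bits("abcdefgh"))  # 3.0 bits: uniform over 8 symbols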
https://en.m.wikipedia.org/wiki/Convergent_cross_mapping
I have been using the spaCy 3 nightly for a while now. It is game-changing.
spaCy 3 practically covers 90% of NLP use cases with near-SOTA performance. The only reason not to use it would be if you are literally pushing the boundaries of NLP or building something highly specialized.
Hugging Face and spaCy (also PyTorch, but duh) are saving companies around the world millions of dollars in man-hours. They've been a revelation.
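For anyone who hasn't tried it, the core workflow really is just a few lines. A minimal sketch (assumes the small English model is installed via "python -m spacy download en_core_web_sm"):

    import spacy

    # Load a pretrained English pipeline (tagger, parser, NER, ...).
    nlp = spacy.load("en_core_web_sm")

    doc = nlp("Hugging Face and Explosion are based in New York and Berlin.")

    # Named entities out of the box.
    for ent in doc.ents:
        print(ent.text, ent.label_)

    # Per-token part-of-speech tags and dependency labels.
    for token in doc:
        print(token.text, token.pos_, token.dep_)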
JPKab 12 hours ago
Everything in the parent comment sounds like hyped overstatement. None of it is.
As someone who has worked on some rather intensive NLP implementations: spaCy 3.0 and Hugging Face both represent the culmination of a technological leap in NLP that started a few years ago with the advent of transfer learning. The level of accessibility these libraries offer the masses is game-changing and democratizing.
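That accessibility is easy to demonstrate: with Hugging Face Transformers, running a pretrained model is a two-liner. A minimal sketch (the first call downloads and caches a default checkpoint):

    from transformers import pipeline

    # Downloads and caches a default pretrained sentiment model on first use.
    classifier = pipeline("sentiment-analysis")

    print(classifier("These libraries are a revelation."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]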
binarymax 14 hours ago
I have lots of experience with both, and I use them together for different use cases. spaCy fills the need for predictable/explainable pattern matching and NER, and is very fast and reasonably accurate on a CPU. Hugging Face fills the need for task-based prediction when you have a GPU.
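The predictable/explainable part is largely spaCy's rule-based Matcher, where every hit traces back to a pattern you wrote yourself. A minimal sketch:

    import spacy
    from spacy.matcher import Matcher

    nlp = spacy.load("en_core_web_sm")
    matcher = Matcher(nlp.vocab)

    # Match the word "version" followed by a number-like token, e.g. "version 3.0".
    pattern = [{"LOWER": "version"}, {"LIKE_NUM": True}]
    matcher.add("VERSION", [pattern])

    doc = nlp("spaCy version 3.0 shipped with transformer pipelines.")
    for match_id, start, end in matcher(doc):
        print(doc[start:end].text)  # -> version 3.0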
danieldk 12 hours ago
> Huggingface fills the need for task based prediction when you have a GPU.
With model distillation, you can make models that annotate hundreds of sentences per second on a single CPU with a library like Huggingface Transformers.
For instance, one of my distilled Dutch multi-task syntax models (UD POS, language-specific POS, lemmatization, morphology, dependency parsing) annotates 316 sentences per second with 4 threads on a Ryzen 3700X. This distilled model has virtually no loss in accuracy compared to the finetuned XLM-RoBERTa base model.
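For context, the heart of distillation is training the small model to match the large model's output distribution. A minimal PyTorch sketch of the standard soft-target loss (my own illustration, not the exact recipe behind the model above):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # KL divergence between the softened teacher and student distributions.
        # The temperature flattens both so the student also learns the teacher's
        # relative probabilities for the non-top classes (Hinton et al., 2015).
        t = temperature
        soft_teacher = F.softmax(teacher_logits / t, dim=-1)
        log_student = F.log_softmax(student_logits / t, dim=-1)
        # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
        return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

    # Example shapes: a batch of 8 tokens, 17 UD POS tags.
    student = torch.randn(8, 17, requires_grad=True)
    teacher = torch.randn(8, 17)
    distillation_loss(student, teacher).backward()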
I don't use Huggingface Transformers itself, but I ported some of their implementations to Rust [1]. That should not make a big difference, though, since all the heavy lifting happens in C++ in libtorch anyway.
tl;dr: it is not true that transformers are only useful for GPU prediction. You can get high CPU prediction speeds with some tricks (distillation, length-based bucketing in batches, using MKL, etc.); a sketch of the bucketing trick follows below.
[1] https://github.com/tensordot/syntaxdot/tree/main/syntaxdot-t...
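Length-based bucketing is library-agnostic: batch sentences of similar length together so each batch is padded only to its own longest member rather than the corpus-wide maximum. A minimal sketch in plain Python (annotate_batch is a hypothetical stand-in for whatever model call you use):

    def bucketed_batches(sentences, batch_size=32):
        # Sort indices by sentence length so each batch holds
        # similar-length sentences and padding waste is minimized.
        order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            yield idx, [sentences[i] for i in idx]

    # Usage: keep the original indices so predictions can be restored
    # to input order afterward.
    # results = [None] * len(corpus)
    # for idx, batch in bucketed_batches(corpus):
    #     for i, pred in zip(idx, annotate_batch(batch)):  # hypothetical call
    #         results[i] = pred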
ZeroCool2u 13 hours ago
spaCy and Hugging Face fulfill practically 99% of all our needs for NLP projects at work. Really incredible bodies of work.
Also, my team chat is currently filled with people being extremely stoked about the spaCy + FastAPI support! I really hope FastAPI replaces Flask sooner rather than later.
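A minimal sketch of what that pairing looks like (endpoint and model names are illustrative; assumes fastapi, uvicorn, and the small spaCy English model are installed):

    import spacy
    from fastapi import FastAPI
    from pydantic import BaseModel

    nlp = spacy.load("en_core_web_sm")
    app = FastAPI()

    class TextIn(BaseModel):
        text: str

    @app.post("/ner")
    def extract_entities(payload: TextIn):
        # FastAPI validates the JSON body against TextIn automatically.
        doc = nlp(payload.text)
        return {"entities": [{"text": e.text, "label": e.label_} for e in doc.ents]}

    # Run with: uvicorn main:app --reload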
langitbiru 12 hours ago
So with spaCy 3.0 and Hugging Face, do we still have a reason to use NLTK? Or do they complement each other? Right now I've lost track of the progress in NLP.
gillesjacobs 9 hours ago
NLTK is showing its age. In my information extraction pipelines, the heavy lifting for modelling is done by spaCy, AllenNLP, and Huggingface (and PyTorch or TF, ofc).
I only use NLTK because it has some basic tools for low-resource languages for which no one has pretrained a transformer model, or for specific NLP-adjacent tasks. I still use their agreement metrics module, for instance; a sketch is below. But that's about it. Dependency parsing, NER, lemmatising, and stemming are all better with the packages mentioned above.
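The agreement module is a good example of the durable corner of NLTK. A minimal sketch of computing inter-annotator agreement over (coder, item, label) triples:

    from nltk.metrics.agreement import AnnotationTask

    # Each record is a (coder, item, label) triple.
    data = [
        ("ann1", "sent1", "POS"), ("ann2", "sent1", "POS"),
        ("ann1", "sent2", "NEG"), ("ann2", "sent2", "POS"),
        ("ann1", "sent3", "NEG"), ("ann2", "sent3", "NEG"),
    ]

    task = AnnotationTask(data=data)
    print("Cohen's kappa:", task.kappa())
    print("Krippendorff's alpha:", task.alpha())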
Object detection: