60 packages returned for Tags:"NLP"

Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and give the base forms of words and their parts of speech, identify whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, and mark up the structure of...
Syn.Bot by: Syn
  • 23,665 total downloads
  • last updated 9/6/2017
  • Latest version: 2.6.0
  • Bot NLP AI OSCOVA SIML
Robust standalone bot development framework with support for 2 unique architectures. The framework contains OSCOVA and an official SIML interpreter. It is portable and can target multiple platforms. Using the library, developers can easily create intelligent Bots or integrate Artificial...
Stanford NER (also known as CRFClassifier) is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. The software provides a general (arbitrary...
A natural language parser is a program that works out the grammatical structure of sentences, for instance, which groups of words go together (as "phrases") and which words are the subject or object of a verb. Probabilistic parsers use knowledge of language gained from hand-parsed sentences to try...
A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like 'noun-plural'.
jieba.NET Segmenter
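jieba.NET supports several segmentation modes suited to different application scenarios. It can segment Traditional Chinese text, and custom dictionaries can be added to improve segmentation in specific scenarios. jieba.NET provides two keyword extraction algorithms: TF-IDF and TextRank.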
AvsAn by: eamon
Find the English indefinite article ('a' or 'an') for a word. Based on real usage patterns extracted from the Wikipedia text dump, so it can handle tricky edge cases such as acronyms (FIAT vs. FAA, NASA vs. NSA) and odd symbols. (Requires .NET Core 1.0 or .NET 4.5)
Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages require more extensive token pre-processing, which is usually called segmentation.
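As a rough illustration of the punctuation splitting and possessive separation described above, here is a minimal Python sketch. The pattern and the `tokenize` helper are assumptions made for this example only; they are not the tokenizer shipped in any package listed here, and a production tokenizer handles many more cases (contractions, abbreviations, URLs, and so on).

```python
import re

# Hypothetical pattern for illustration, tried left to right at each position:
#   's\b     the possessive clitic, kept as its own token
#   \w+      a run of letters, digits, or underscores
#   [^\w\s]  any single remaining non-space character (punctuation)
TOKEN_RE = re.compile(r"'s\b|\w+|[^\w\s]")


def tokenize(text):
    """Split English text into word, possessive, and punctuation tokens."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("John's dog barked, loudly."))
    # ['John', "'s", 'dog', 'barked', ',', 'loudly', '.']
```

Whitespace is discarded rather than tokenized, which is why languages written without spaces between words need the more involved segmentation step mentioned above.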
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks...