[Screenshot: GitHub repository page for sloria/TextBlob (https://github.com/sloria/TextBlob). Repository description: "Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more". Documentation: https://textblob.readthedocs.io/]
The TextBlob toolkit
• https://github.com/sloria/TextBlob
• Features:
  ◦ Tokenization (splitting text into words and sentences)
  ◦ Noun phrase extraction
  ◦ Part-of-speech tagging
  ◦ Sentiment analysis
  ◦ Classification (Naive Bayes, Decision Tree)
  ◦ Language translation and detection powered by Google Translate
  ◦ Word and phrase frequencies
  ◦ Spelling correction
  ◦ Word inflection (singularization and pluralization) and lemmatization
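The features listed above can be tried out with a short sketch like the one below. It assumes TextBlob is installed (`pip install textblob`). The helper names `corpus_free_features`, `inflect`, and `corpus_based_features` are hypothetical and used only for illustration. The last helper also assumes the NLTK corpora have been fetched with `python -m textblob.download_corpora`. Translation and language detection are left out because they depend on an external web service.

```python
from typing import Dict, Tuple


def corpus_free_features(text: str) -> Dict[str, object]:
    """Sentiment and spelling correction; these use data files bundled with TextBlob."""
    from textblob import TextBlob  # third-party: pip install textblob

    blob = TextBlob(text)
    return {
        "polarity": blob.sentiment.polarity,          # float in [-1.0, 1.0]
        "subjectivity": blob.sentiment.subjectivity,  # float in [0.0, 1.0]
        "corrected": str(blob.correct()),             # spelling-corrected text
    }


def inflect(word: str) -> Tuple[str, str]:
    """Return the (singular, plural) forms of an English word."""
    from textblob import Word

    w = Word(word)
    return str(w.singularize()), str(w.pluralize())


def corpus_based_features(text: str) -> Dict[str, object]:
    """Tokenization, tagging, noun phrases, and word counts.

    Assumption: the NLTK corpora are already downloaded, e.g. with
    `python -m textblob.download_corpora`.
    """
    from textblob import TextBlob

    blob = TextBlob(text)
    return {
        "sentences": [str(s) for s in blob.sentences],  # sentence splitting
        "words": list(blob.words),                      # word tokenization
        "tags": blob.tags,                              # (word, POS tag) pairs
        "noun_phrases": list(blob.noun_phrases),        # noun phrase extraction
        "word_counts": dict(blob.word_counts),          # word frequencies
    }
```

For example, `corpus_free_features("What great fun!")` returns a dictionary whose `polarity` entry is a positive number, and `inflect("cats")` returns the singular and plural forms as a tuple.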
[Screenshot: PyPI project page for jieba 0.39 (https://pypi.org/project/jieba/), last released Aug 28, 2017, installable with "pip install jieba". Summary: "Chinese Words Segmentation Utilities". Project tagline: "Jieba" Chinese word segmentation, aiming to be the best Python Chinese word segmentation component.]
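Chinese text processing tool: jieba
• A Chinese word segmentation and part-of-speech tagging tool
• Features:
  ◦ Word segmentation (including parallel segmentation and support for custom dictionaries)
  ◦ Part-of-speech tagging
  ◦ Keyword extraction
• Installing jieba (e.g. pip install jieba)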
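A minimal sketch of the three jieba features listed above, assuming jieba is installed (`pip install jieba`). The helper names `segment`, `pos_tag`, and `keywords` are hypothetical wrappers introduced here for illustration. The custom dictionary is represented by individual words added through `jieba.add_word`; a dictionary file could be loaded with `jieba.load_userdict` instead.

```python
from typing import Iterable, List, Tuple


def segment(text: str, extra_words: Iterable[str] = ()) -> List[str]:
    """Segment Chinese text into words, optionally registering custom words first."""
    import jieba  # third-party: pip install jieba

    for word in extra_words:
        jieba.add_word(word)  # extend the dictionary at runtime
    return jieba.lcut(text)   # default (accurate) mode, returned as a list


def pos_tag(text: str) -> List[Tuple[str, str]]:
    """Return (word, part-of-speech flag) pairs."""
    import jieba.posseg as pseg

    return [(pair.word, pair.flag) for pair in pseg.cut(text)]


def keywords(text: str, top_k: int = 5) -> List[str]:
    """Return up to top_k keywords ranked by jieba's TF-IDF extractor."""
    import jieba.analyse

    return jieba.analyse.extract_tags(text, topK=top_k)
```

Calling `segment("我来到北京清华大学")` returns the sentence as a list of word tokens, and `pos_tag` on the same sentence pairs each token with its tag. For large inputs, jieba also provides `jieba.enable_parallel`, which splits the work across processes and is not supported on Windows.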
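Outline
• 3.1 What is feature engineering?
• 3.2 Automatic word segmentation, part-of-speech tagging, and syntactic parsing in natural language processing
• 3.3 The vector space model and text similarity computation
• 3.4 Similarity computation
• 3.5 Feature scaling and normalization
• 3.6 Feature selection
• 3.7 Feature dimensionality reduction and expansion
Liu Yuanchao, School of Computer Science, Harbin Institute of Technology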