Intuition
Postulate: documents that are "close together" in the vector space talk about the same things.
Use cases: query-by-example; free-text query as a vector.
[Figure: document vectors d1 to d5 drawn against term axes t1, t2, t3, with angles θ and φ marked between vectors]
CCF-ADL at Zhengzhou University, June 25-27, 2010
Cosine similarity
• The "closeness" of vectors d1 and d2 can be measured by the size of the angle between them.
• Specifically, the cosine of that angle is used as the vector similarity.
• Vectors are normalized by length (normalization), so that each document vector has unit norm:
$$|d_j| = \sqrt{\sum_{i=1}^{M} w_{i,j}^2} = 1$$
$$sim(d_j, d_k) = \frac{d_j \cdot d_k}{|d_j|\,|d_k|} = \frac{\sum_{i=1}^{M} w_{i,j}\, w_{i,k}}{\sqrt{\sum_{i=1}^{M} w_{i,j}^2}\;\sqrt{\sum_{i=1}^{M} w_{i,k}^2}}$$
[Figure: vectors d1 and d2 drawn against term axes t1, t2, t3, with the angle θ between them]
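As an illustration of the formula above, here is a minimal Python sketch of cosine similarity and length normalization. The function names and the three toy vectors are assumptions made for this example, not part of the lecture; each vector holds term weights w_{i,j} over the same M terms.

```python
import math


def norm(v):
    """Euclidean length |d| = sqrt(sum_i w_i^2)."""
    return math.sqrt(sum(w * w for w in v))


def normalize(v):
    """Scale a vector to unit length; a zero vector has no direction."""
    n = norm(v)
    if n == 0:
        raise ValueError("cannot normalize a zero vector")
    return [w / n for w in v]


def cosine_similarity(dj, dk):
    """sim(dj, dk) = (dj . dk) / (|dj| |dk|)."""
    if len(dj) != len(dk):
        raise ValueError("vectors must have the same number of terms")
    denom = norm(dj) * norm(dk)
    if denom == 0:
        return 0.0  # convention chosen here for empty documents
    return sum(a * b for a, b in zip(dj, dk)) / denom


# Hypothetical term-weight vectors over three terms (t1, t2, t3).
d1 = [2.0, 0.0, 1.0]
d2 = [4.0, 0.0, 2.0]  # same direction as d1, twice as long
d3 = [0.0, 3.0, 0.0]  # shares no terms with d1

sim_12 = cosine_similarity(d1, d2)
sim_13 = cosine_similarity(d1, d3)

# After normalization the similarity reduces to a plain dot product.
u1, u2 = normalize(d1), normalize(d2)
dot_12 = sum(a * b for a, b in zip(u1, u2))
```

Because d2 is a scaled copy of d1, the two point in the same direction and their similarity is 1 up to rounding, while d1 and d3 share no terms and score 0. Document length does not affect the score, which is the point of normalizing.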
Latent Semantic Model
Vector Space Model: Pros
• Automatic selection of index terms
• Partial matching of queries and documents (dealing with the case where no document contains all search terms)
• Ranking according to similarity score (dealing with large result sets)
• Term weighting schemes (improves retrieval performance)
• Various extensions
  – Document clustering
  – Relevance feedback (modifying query vector)
• Geometric foundation
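The partial-matching and ranking points can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the toy corpus, the whitespace tokenizer, and the use of raw term counts as weights (a real system would likely use a term weighting scheme instead).

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Map text to raw term counts over a fixed vocabulary (assumed weighting)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    denom = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / denom


def rank(query, docs):
    """Return (name, score) pairs sorted by decreasing similarity to the query."""
    terms = set(query.lower().split())
    for text in docs.values():
        terms.update(text.lower().split())
    vocabulary = sorted(terms)
    q = to_vector(query, vocabulary)
    scored = [(name, cosine(q, to_vector(text, vocabulary)))
              for name, text in docs.items()]
    # Highest score first; ties broken by document name for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical three-document corpus.
docs = {
    "d1": "blackberry jam recipe",
    "d2": "blackberry phone review",
    "d3": "apple pie recipe",
}

# No document contains all three query terms, yet each still gets a score.
ranking = rank("blackberry phone price", docs)
for name, score in ranking:
    print(name, round(score, 3))
```

In this toy run d2 ranks first because it shares two query terms, d1 follows with one, and d3 scores zero. A Boolean AND query over the same three terms would return nothing at all.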
I guess this page is about a blackberry...?
[Figure: the word "blackberry" repeated several times]