Monday May 16th, at 11 AM, Distinguished Lecture Series, Title: Modeling for Analyzing Document Collection, Speaker: Mitsunori Ogihara, Department of Computer Science, University of Miami

Monday Lecture Series:
Topic: Modeling for Analyzing Document Collection
Speaker: Mitsunori Ogihara, Department of Computer Science, University of Miami
Date and time: May 16th, 11 AM
Location: ITE Building, Room 325

Abstract: Topic modeling (in particular, Latent Dirichlet Allocation) is
a technique for analyzing a large collection of documents. In topic
modeling we view each document as a frequency vector over a vocabulary
and each topic as a static distribution over the vocabulary. Given a
desired number, K, of document classes, a topic modeling algorithm
attempts to estimate concurrently the K static distributions and, for
each document, how much each of the K classes contributes. Mathematically, this is
the problem of approximating the matrix formed by stacking the
frequency vectors as the product of two non-negative matrices, where
both the column dimension of the first matrix and the row dimension of
the second matrix are equal to K. Topic modeling has recently been
gaining popularity for analyzing large collections of documents. In this talk
I will present some examples of applying topic modeling: (1) a small
sentiment analysis of a small collection of short patient surveys,
(2) exploratory content analysis of a large collection of letters,
(3) document classification based upon topics and other linguistic
features, and (4) exploratory analysis of a large collection of
literary works. I will cover not only the exact topic modeling steps
but also all the preprocessing steps for preparing the documents for
topic modeling.
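
The matrix view described in the abstract can be illustrated with a small sketch. The code below is not the speaker's method and is not Latent Dirichlet Allocation; it swaps in plain non-negative matrix factorization with multiplicative updates, a related technique, only to show the shapes involved: a document-by-word count matrix is approximated by a document-by-K matrix times a K-by-word matrix. The function names, the toy counts, and the choice K = 2 are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(x, w, h):
    """Sum of squared differences between x and the product w * h."""
    approx = matmul(w, h)
    return sum((xv - av) ** 2
               for x_row, a_row in zip(x, approx)
               for xv, av in zip(x_row, a_row))


def nmf(x, k, iterations=200, seed=0, eps=1e-9):
    """Approximate x (docs x words) by w (docs x k) times h (k x words).

    Uses multiplicative updates for the squared-error objective, which keep
    every entry of w and h non-negative. This is an illustrative stand-in
    for a topic model, not LDA itself.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(x), len(x[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_words)] for _ in range(k)]
    for _ in range(iterations):
        # Update the topic-by-word matrix h.
        wt = transpose(w)
        numer = matmul(wt, x)
        denom = matmul(matmul(wt, w), h)
        h = [[h[i][j] * numer[i][j] / (denom[i][j] + eps)
              for j in range(n_words)] for i in range(k)]
        # Update the document-by-topic matrix w.
        ht = transpose(h)
        numer = matmul(x, ht)
        denom = matmul(w, matmul(h, ht))
        w = [[w[i][j] * numer[i][j] / (denom[i][j] + eps)
              for j in range(k)] for i in range(n_docs)]
    return w, h


# Hypothetical toy data: 4 documents over a 5-word vocabulary.
# Rows are the per-document frequency vectors, stacked into one matrix.
docs = [
    [3, 2, 0, 0, 0],
    [4, 3, 1, 0, 0],
    [0, 0, 2, 3, 4],
    [0, 1, 3, 2, 5],
]

w, h = nmf(docs, k=2)
print("document-by-topic weights:", w)
print("topic-by-word weights:", h)
print("squared error:", squared_error(docs, w, h))
```

Each row of `w` says how much each of the K classes contributes to one document, and each row of `h` plays the role of one topic over the vocabulary. In this sketch the rows of `h` are not normalized; dividing each row by its sum would turn it into a distribution.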

Biography: Mitsunori Ogihara is a Professor of Computer Science at the
University of Miami, Coral Gables, Florida. There he directs the Data
Mining Group in the Center for Computational Science, a university-wide
organization for providing resources and consultation for large-scale
computation. He has published three books and approximately 190 papers
in conferences and journals. He is on the editorial board for Theory of
Computing Systems and International Journal of Foundations of Computer
Science. Ogihara received a Ph.D. in Information Sciences from Tokyo
Institute of Technology in 1993 and was a tenure-track/tenured faculty
member in the Department of Computer Science at the University of
Rochester from 1994 to 2007.
