Text Mining with MATLAB, 2nd Edition
- Length: 487 pages
- Edition: 2
- Language: English
- Publisher: Springer
- Publication Date: 2021-11-26
- ISBN-10: 3030876942
- ISBN-13: 9783030876944
Text Mining with MATLAB® provides a comprehensive introduction to text mining using MATLAB. It is designed to help text mining practitioners, as well as those with little-to-no experience with text mining in general, familiarize themselves with MATLAB and its complex applications.
The book is structured in three main parts. The first part, Fundamentals, introduces basic procedures and methods for manipulating and operating with text within the MATLAB programming environment. The second part, Mathematical Models, is devoted to motivating, introducing, and explaining the two main paradigms of mathematical models most commonly used for representing text data: the statistical and the geometrical approach. Finally, the third part, Techniques and Applications, addresses general problems in text mining and natural language processing applications such as document categorization, document search, content analysis, summarization, question answering, and conversational systems. This second edition includes updates in line with the recently released “Text Analytics Toolbox” within the MATLAB product and introduces three new chapters and six new sections in existing ones.
All descriptions presented are supported with practical examples that are fully reproducible. Further reading, as well as additional exercises and projects, are proposed at the end of each chapter for those readers interested in conducting further experimentation.
Preface
Table of Contents
1 Introduction
  1.1 About Text Mining and MATLAB®
  1.2 About this Book
  1.3 A (very) Brief Introduction to MATLAB®
  1.4 The Text Analytics Toolbox™
  1.5 Further Reading
  1.6 References
Part I: Fundamentals
2 Handling Text Data
  2.1 Character and Character Arrays
  2.2 Handling Text with Cell Arrays
  2.3 Handling Text with Structures
  2.4 Handling Text with String Arrays
  2.5 Some Useful Functions
  2.6 Further Reading
  2.7 Proposed Exercises
  2.8 References
3 Regular Expressions
  3.1 Basic Operators for Matching Characters
  3.2 Matching Sequences of Characters
  3.3 Conditional Matching
  3.4 Working with Tokens
  3.5 Further Reading
  3.6 Proposed Exercises
  3.7 References
4 Basic Operations with Strings
  4.1 Searching and Comparing
  4.2 Replacement and Insertion
  4.3 Segmentation and Concatenation
  4.4 Set Operations
  4.5 Further Reading
  4.6 Proposed Exercises
  4.7 References
5 Reading and Writing Files
  5.1 Basic File Formats
  5.2 Other Useful Formats
  5.3 Handling Files and Directories
  5.4 Further Reading
  5.5 Proposed Exercises
  5.6 References
6 The Structure of Language
  6.1 Levels of the Linguistic Phenomena
  6.2 Morphology and Syntax
  6.3 Semantics and Pragmatics
  6.4 Further Reading
  6.5 Proposed Exercises
  6.6 References
Part II: Mathematical Models
7 Basic Corpus Statistics
  7.1 Fundamental Properties
  7.2 Word Co-occurrences
  7.3 Accounting for Order
  7.4 Further Reading
  7.5 Proposed Exercises
  7.6 Short Projects
  7.7 References
8 Statistical Models
  8.1 Basic n-gram Models
  8.2 Discounting
  8.3 Model Interpolation
  8.4 Topic Models
  8.5 Further Reading
  8.6 Proposed Exercises
  8.7 Short Projects
  8.8 References
9 Geometrical Models
  9.1 The Term-Document Matrix
  9.2 The Vector Space Model
  9.3 Association Scores and Distances
  9.4 Further Reading
  9.5 Proposed Exercises
  9.6 Short Projects
  9.7 References
10 Dimensionality Reduction
  10.1 Vocabulary Pruning and Merging
  10.2 The Linear Transformation Approach
  10.3 Non-linear Projection Methods
  10.4 Embeddings
  10.5 Further Reading
  10.6 Proposed Exercises
  10.7 Short Projects
  10.8 References
Part III: Methods and Applications
11 Document Categorization
  11.1 Data Collection Preparation
  11.2 Unsupervised Clustering
  11.3 Supervised Classification in Vector Space
  11.4 Supervised Classification in Probability Space
  11.5 Further Reading
  11.6 Proposed Exercises
  11.7 Short Projects
  11.8 References
12 Document Search
  12.1 Binary Search
  12.2 Vector-based Search
  12.3 The BM25 Ranking Function
  12.4 Cross-language Search
  12.5 Further Reading
  12.6 Proposed Exercises
  12.7 Short Projects
  12.8 References
13 Content Analysis
  13.1 Dimensions of Analysis
  13.2 Polarity Estimation
  13.3 Qualifier and Aspect Identification
  13.4 Entity, Relation and Definition Extraction
  13.5 Further Reading
  13.6 Proposed Exercises
  13.7 Short Projects
  13.8 References
14 Keyword Extraction and Summarization
  14.1 Keywords and Word Clouds
  14.2 Text Summarization
  14.3 Further Reading
  14.4 Proposed Exercises
  14.5 Short Projects
  14.6 References
15 Question Answering and Dialogue
  15.1 Question Answering
  15.2 Dialogue Systems
  15.3 Further Reading
  15.4 Proposed Exercises
  15.5 Short Projects
  15.6 References