The Reliability of Generating Data
- Length: 316 pages
- Edition: 1
- Language: English
- Publisher: Chapman and Hall/CRC
- Publication Date: 2022-12-23
- ISBN-10: 0367630710
- ISBN-13: 9780367630713
All data are the result of human actions, whether experiments, observations, or declarations. As such, the presumption of knowing what data are about is subject to imperfections that can affect the validity of research efforts. With calls for data-based research comes the need to assure the reliability of generated data. The reliability of converting texts into analyzable data has become a burning issue in several areas. However, this issue has been met by only a few limited, and sometimes misleading, measures of the extent to which data can be trusted as surrogates for the phenomena of analytical interest. The statistic proposed by the author, “Krippendorff’s Alpha”, is widely used in the social sciences, not only where human judgements are involved but also where measurements are compared.
The Reliability of Generating Data expands on the author’s seminal work in content analysis and develops methods for assessing the reliability of kinds of data that previously defied such evaluation. It opens with a discussion of the epistemology of reliable data, then presents the most basic alpha coefficient for the single-valued coding of predefined units. This largely familiar way of measuring reliability provides the platform for the succeeding chapters, which start with an overview of alternative coefficients and then extend alpha one capability at a time to cope with multi-valued coding, the segmentation of texts into meaningful units, big data, and information retrieval. The book also includes a chapter on how to diagnose and remedy imperfections and one on applicable standards, all converging on the statistical issues of the reliability of generating data.
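As an illustration of the "most basic alpha coefficient" mentioned above, the following is a minimal sketch of the widely published nominal-data form of Krippendorff's alpha, alpha = 1 - D_o / D_e, computed from coincidences of values within units. The function name `nominal_alpha` and the sample codings are illustrative assumptions; they are not taken from the book, whose own notation and examples may differ.

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    `units` is a list of lists: the values assigned to each unit by the
    observers who coded it. Missing codings are simply left out.
    """
    # o_ck: each ordered pair of values within a unit counts 1 / (m - 1),
    # where m is the number of values recorded for that unit.
    coincidences = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue  # a unit coded once says nothing about agreement
        for c, k in permutations(values, 2):
            coincidences[(c, k)] += 1.0 / (m - 1)

    # n_c: marginal totals of the coincidence matrix; n: their sum.
    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Both quantities omit the common factor 1/n, which cancels in the ratio.
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined: the data show no variation")
    return 1.0 - observed / expected


# Made-up binary codings of ten units by two observers (hypothetical data).
codings = [[0, 1], [1, 1], [0, 1], [0, 0], [0, 0],
           [0, 1], [0, 0], [0, 0], [1, 0], [0, 0]]
print(round(nominal_alpha(codings), 3))  # 0.095
```

Here the two observers agree on six of ten units, yet alpha is only about 0.095, because most of that agreement is what the skewed distribution of values would produce by chance.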
Features:
- Provides an overview of methods for assessing the reliability of generating data
- Expands a statistic proposed by the author, already widely used in the social sciences
- Includes many easy-to-follow numerical examples to illustrate the measures
- Written to be useful to beginning and advanced researchers from many disciplines, notably linguistics, sociology, psychometric and educational research, and medical science.
Table of Contents:
- Cover
- Half Title
- Title Page
- Copyright Page
- Table of Contents
- About the Author
- 0 How I Became Interested in Reliability Issues
  - 0.1 The Initial Challenge
  - 0.2 Coming to Understand the Issues
  - 0.3 Outline of the Chapters of This Book
- 1 On the Epistemology of Reliable Data
  - 1.1 When Are Data?
  - 1.2 Four Conditions for Being Data
    - 1.2.1 Durability and Retrievability
    - 1.2.2 Computability, Analyzability and Distinguishability
    - 1.2.3 Ability to Serve as Surrogates for Analytical Concerns
    - 1.2.4 Ability to Inform Research Questions
    - 1.2.5 Confinements to Past Happenings
  - 1.3 Diverse Disciplinary Approaches to Reliability: In Engineering, In Psychometry, Sampling Theory, Measurement Theorists, Interpretive Or Qualitative Scholars
  - 1.4 My Synthesis
  - 1.5 Five Reliabilities: Stability, Replicability, Accuracy, Surrogacy, Decisiveness
  - 1.6 The Form of Reliability Data
  - 1.7 Conditions for Data to Be Reliability Data Proper
  - 1.8 Relations Between Reliability and Validity
- 2 Simplest Cases: The Replicability of Categorizing Predefined Units
  - 2.1 Reliability Data Generated By Two Observers
    - 2.1.2 Observed Disagreements
    - 2.1.3 Expected Disagreements
    - 2.1.4 The Alpha Coefficient Generally and Applied to Nominal Data
    - 2.1.5 A Numerical Example
  - 2.2 The Alpha Coefficient for Binary Data
  - 2.3 Reliability Data With Any Number of Replications
    - 2.3.1 An Example
- 3 Some Basic Properties of Alpha
  - 3.1 Alpha’s Range
  - 3.2 Alpha’s Degrees of Freedom
  - 3.3 Alpha’s Scale
  - 3.4 Alpha as an Information Measure
  - 3.5 Alpha’s Independence From the Number of Replications, Units, and Missing Data
  - 3.6 Alpha’s Responsiveness to Small Sample Sizes
  - 3.7 Alpha’s Reliance On Estimating the Distribution of Values in the Population of Data Whose Reliability Is in Question
  - 3.8 Relations Between Contingencies and Coincidences
  - 3.9 Predicting Observed Coincidences From Alpha
  - 3.10 The Necessity of Information in the Variance of Reliability Data
- 4 Alpha Compared With Primarily Nominal Agreement Measures
  - 4.1 Percent Agreement
  - 4.2 Various Chance-Corrections of Percent Agreements
    - 4.2.1 Bennett et al.’s S
    - 4.2.2 Cohen’s κ (Kappa)
    - 4.2.3 Gwet’s AC1
    - 4.2.4 Scott’s π (Pi) and Fleiss’ K (Mistakenly Equated With Kappa)
  - 4.3 Pearson’s Intraclass Correlation
  - 4.4 Comparisons and Properties of Several Agreement Coefficients
- 5 Metric Differences Between Single-Valued Units
  - 5.1 Reliability Data for the Replicability of Coding Predefined Units
  - 5.2 Metric C-Alphas
  - 5.3 Metric Differences for Single-Valued Data
    - 5.3.1 The Nominal Metric Difference Function
    - 5.3.2 The Ordinal Metric Difference Function
    - 5.3.3 The Interval Metric Difference Function
    - 5.3.4 The Ratio Metric Difference Function
    - 5.3.5 The Polar Metric Difference Function
    - 5.3.6 The Circular Metric Difference Function
    - 5.3.7 Other Difference Functions
  - 5.4 Comparative Properties of Metric Difference Functions
  - 5.5 A Numerical Example for Two Metric Differences
- 6 The Quadrilogy for Single-Valued Predefined Units and Big Data
  - 6.1 Big Binary Reliability Data
  - 6.2 Basic Binary Conceptions of the Quadrilogy
    - 6.2.1 The Replicability of the Process of Generating Data
    - 6.2.2 The Accuracy of Generated Data
    - 6.2.3 The Surrogacy of an Alternative for Coding of Data
    - 6.2.4 The Decisiveness of Majorities
    - 6.2.5 A Minimal Numerical Example for Comparing the Quadrilogy
  - 6.3 Extension to Other Than Binary Data of Medium Size
    - 6.3.1 The Replicability of the Process of Generating Data: C-Alpha Metric
    - 6.3.2 The Accuracy of Generated Data: C-Alpha-S
    - 6.3.3 Majorities, Modes, Medians, and Means
      - 6.3.3.1 For Nominal Data
      - 6.3.3.2 For Ordinal Or Ranked Data
      - 6.3.3.3 For Interval, Ratio, and Polar Data
    - 6.3.4 The Surrogacy of an Alternative to Coding of Data: C-Alpha-Sm
    - 6.3.5 The Decisiveness of Majorities, Medians, and Means: C-Alpha-M
  - 6.4 A Numerical Example Beyond the Binary
- 7 Multi-Valued Coding of Predefined Units
  - 7.1 Multi-Valued Reliability Data for Replicability
    - 7.1.1 Its Observed Coincidences
  - 7.2 Multi-Valued Difference Functions
  - 7.3 Multi-Valued Disagreements for Replicability
    - 7.3.1 Observed Multi-Valued Disagreements
    - 7.3.2 Expected Multi-Valued Disagreement
    - 7.3.3 C-Alpha for Multi-Valued Replicability Data
    - 7.3.4 A Numerical Example of C-Alpha
  - 7.4 C∈C-Alpha for Individual Values in Multi-Valued Sets of Values
    - 7.4.1 Its Observed Disagreements
    - 7.4.2 Its Expected Disagreement
    - 7.4.3 C∈C-Alpha for Individual Values in Sets of Values
  - 7.5 C-Alpha-S for Multi-Valued Accuracy
  - 7.6 -Alpha for the Replicability of Standardized Sets, Ignoring Their Sizes
    - 7.6.1 The Numerical Example Continued
- 8 Partitioning Continua and Coding Relevant Segments
  - 8.1 Replicability of Single-Valued Partitions of Continua
    - 8.1.1 Its Reliability Data
    - 8.1.2 Its Observed Coincidences
    - 8.1.3 Its Observed Disagreements
    - 8.1.4 Its Expected Coincidences
    - 8.1.5 Its Expected Disagreements
    - 8.1.6 U-Alpha for Partitioned Continua
    - 8.1.7 A Numerical Example
  - 8.2 The Family of U-Alpha Coefficients
    - 8.2.1 Φ|u-Alpha for Distinguishing Relevant and Irrelevant Matter
    - 8.2.2 The Numerical Example Continued
    - 8.2.3 Cu-Alpha for Intersections of Relevant Segments With Diverse Metrics
    - 8.2.4 The Numerical Example Continued
  - 8.3 Relations Between the U-Alpha Family and C-Alpha
  - 8.4 The Replicability of Multi-Valued Partitions of Continua
    - 8.4.1 Its Reliability Data
    - 8.4.2 Its Observed Coincidences and Disagreements
    - 8.4.3 Its Expected Coincidences and Disagreements
    - 8.4.4 Cu-Alpha for the Replicability of Multi-Valued Coding of Segments of Unequal Lengths
  - 8.5 The Accuracy of Multi-Valued Partitions of Continua
    - 8.5.1 Its Reliability Data, Numerically Exemplified
    - 8.5.2 Its Observed Contingencies and Disagreements
    - 8.5.3 Its Expected Contingencies and Disagreements
    - 8.5.4 Cu-Alpha-S for the Accuracy of Multi-Valued Coding of Segments of Unequal Lengths
    - 8.5.5 The Numerical Example Concluded
- 9 Preserving the Coherency of Identified Segments in Continua
  - 9.1 Its Difference Function
  - 9.2 Its Observed Disagreement
  - 9.3 Its Expected Disagreement
  - 9.4 U-Alpha for Freely Unitized and Valued Contiguous Segments
  - 9.5 A Numerical Example
  - 9.6 Numerical Comparisons
- 10 Distinctions Drawn Within Continua
  - 10.1 The Replicability of Distinctions
    - 10.1.1 Its Reliability Data
    - 10.1.2 Its Difference Function
    - 10.1.3 Its Observed Disagreement
    - 10.1.4 Its Expected Disagreement
    - 10.1.5 D-Alpha for the Replicability of Drawing Distinctions
    - 10.1.6 A Numerical Example
  - 10.2 The Accuracy of Distinctions
    - 10.2.1 Its Observed Inaccuracy
    - 10.2.2 Its Expected Inaccuracies
    - 10.2.3 D-Alpha-S for the Accuracy of Drawing Distinctions
    - 10.2.4 A Numerical Example
- 11 Text Mining and Information Retrieval
  - 11.1 Precision and Recall
  - 11.2 Reliability Data for Replications of Text Searches
  - 11.3 C-Alpha-IR for Text and Information Retrieval
  - 11.4 C-Alpha-R for the Reliability of Search Engines
  - 11.5 C-Alpha-I for Researchers’ Competence of Handling Search Engines and Interpreting Their Results
  - 11.6 Their Relations
  - 11.7 Reliability Data for M Researchers Facing One Search Engine
- 12 Diagnostic Devices and Remedial Actions
  - 12.1 Concerning Separate Variables
  - 12.2 Concerning Individual Units of Analysis of One Variable
  - 12.3 Minimizing the Use of Default Categories
  - 12.4 Concerning the Use of Appropriate Metrics for a Variable
  - 12.5 Identifying Ambiguous Categories and Values
  - 12.6 Selecting the Most Reliable Replications, Observers, Or Judges
  - 12.7 Maintaining the Reliability of Data While Research Is in Progress
  - 12.8 Selecting the Most Reliable Data From Multiple Contributors
  - 12.9 Deselecting Variables That Fall Short of the Required Reliability
  - 12.10 Separating Default (No-Information) Categories From Those That Matter
  - 12.11 Lumping Ambiguous Values and Preserving Informative Distinctions
  - 12.12 Correcting Sources of Systematic Disagreements
- 13 Some Special Applications
  - 13.1 Coping With Violations of the Mutual Exclusiveness of Categories
  - 13.2 Predefined Units With Unique Categories to Be Identified
  - 13.3 Alpha Extended to Conditions Where Units Are Ranked
  - 13.4 Agreements Among Groups
  - 13.5 Circumscribing and Coding Two-Dimensional Images
  - 13.6 A Variance-Independent Agreement Coefficient
  - 13.7 A Generalization of Percent Agreement to More Than Two Coders of Ordered Data
- 14 Statistical Considerations
  - 14.1 Statistical Meanings of α
    - 14.1.1 The Range of α’s Reliability Interpretations and Its Scale
    - 14.1.2 Variance and Alpha’s Interpretability as a Reliability Coefficient
    - 14.1.3 The “Paradox” of Inadequate Variation Revisited
    - 14.1.4 The Effects of Unreliability On the Reportable Frequencies
  - 14.2 Sampling Issues
    - 14.2.1 Regarding the Number of Replications
    - 14.2.2 Regarding the Volumes of Recording Units and Texts
    - 14.2.3 Regarding Information About Relevant Phenomena
      - 14.2.3.1 In Analyzable Data
      - 14.2.3.2 In Coding Instruments
  - 14.3 Alpha’s Distribution
    - 14.3.1 By Approximation to the Normal Distribution
    - 14.3.2 By Bootstrapping
- 15 Reliability Standards
  - 15.1 Conditions for Agreement Measures and Standards to Apply
  - 15.2 Reliability Standards for Single Variables: Alphaok
  - 15.3 Reliability Standards for Complex Multi-Variable Data
  - 15.4 The Reliability of the Results of Analyzing Imperfect Multi-Variable Data
- 16 Toward a General Calculus of Differences and Agreements
  - 16.1 Its Difference Functions (Reiterated)
  - 16.2 Multi-Variable Analysis of Differences
    - 16.2.1 Frequencies for Multi-Variable Differences in C
    - 16.2.2 Primary Definitions of Differences in C
    - 16.2.3 Secondary Definitions of Differences in C: Agreements and Accounting Equations for Differences
    - 16.2.4 Accounting Equations for Partitions of C
    - 16.2.5 A Numerical Example
  - 16.3 Multi-Variable Analyses of Decisiveness as a Form of Agreement
    - 16.3.1 Means, Medians, Modes Or Majorities
    - 16.3.2 Frequencies for the Multi-Variable Decisiveness in C
    - 16.3.3 Primary Definitions of Decisiveness in C
    - 16.3.4 Secondary Definition of Decisiveness in C: Accounting Equations for Agreements and Decisiveness
  - 16.4 Concluding Remarks
- Appendix
- References
- Index