Inferential statistics is a major area of statistics that makes use of the concept of probability.

An example will help illustrate the relation between transductive-population and evidential-sample inferences. Consider a set of 100 widgets. Suppose an inspector has examined all 100 and discovered the following: There are 10 black widgets and 90 white widgets; moreover, 5 of the black widgets are defective and 9 of the white ones are defective. These are descriptive statistics. The inspector's challenge is to use these descriptive statistics to make some inferences. We distinguish two kinds of inferences the inspector could be called upon to make: transductive or evidential (what we have been calling “standard statistical inference”). If the inference concerns the 100 widgets, the statistics observed are population statistics. For example, suppose the inspector is shown one of the widgets and told that it is white. The inspector recalls that only 10% of white widgets were defective and predicts that it will be fine. This is clearly a kind of inductive inference in that it is not guaranteed to be correct: The inspector's past experience makes the conclusion probable but not certain. But it is a special kind of inductive inference, a transductive inference. The inspector's problem is relatively simple. After having calculated the descriptive statistic, p(defective|white) = 0.1, there is really very little work to be done. The inspector can be confident using the descriptive statistic to guide his inferences because the statistic was calculated from the very examples he is making inferences about. In a certain respect, the inspector is not even making an inference, just reporting a description of the population. To move from “9 of these 90 white widgets are defective” to “one of these white widgets has a 10% chance of being defective” to “a white widget selected at random is probably not defective” hardly seems like much of an inductive leap at all. Put slightly differently, once the inspector has solved the descriptive problem (what is p(defective|white) among the 100 widgets?), the inferential problem of making a prediction about a randomly selected widget is easy.
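
A minimal sketch (in Python, not from the chapter) of how the inspector's descriptive statistic and the transductive prediction could be computed from the counts above:

```python
from collections import Counter

# The 100 inspected widgets as (color, defective) pairs matching the counts above.
widgets = ([("black", True)] * 5 + [("black", False)] * 5 +
           [("white", True)] * 9 + [("white", False)] * 81)

counts = Counter(widgets)
n_white = counts[("white", True)] + counts[("white", False)]   # 90 white widgets
p_defective_given_white = counts[("white", True)] / n_white    # 9 / 90 = 0.1
print(p_defective_given_white)

# Transductive prediction for one of these 100 widgets known to be white:
# simply report the population statistic.
print("defective" if p_defective_given_white > 0.5 else "probably fine")
```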

The inspector faces a more difficult problem when making inferences about a widget not in the “training” set, widget 101. In this case, the descriptive statistics (e.g., p(defective|white) = 0.1) are characteristics of a sample and the inspector must calculate inferential statistics to make predictions. In this case, the inspector must consider the evidential relation between his sample (of 100 widgets) and the general population (from which the new widget was drawn). Is the sample biased? Recognizing and adjusting for sample bias is a specific problem of evidential inference. It is this inferential problem that distinguishes transductive inference from evidential inference. Sample bias does not matter when making a transductive inference to one of the 100 original widgets.

Consider the problem of predicting the color of a widget identified as defective. If the defective widget is one of the 100, the prediction is clear: It is probably white. Nine of the 14 defective widgets encountered were white. If the defective widget is new, widget 101, the prediction is less clear. The 100 original widgets were mostly white. Is that true of the general population of widgets, or is that a bias in the sample? Unless the inspector knows about the relation between his sample and the population, he cannot use the former to make predictions about the latter. However, figuring out that relation, solving this inferential problem, is irrelevant for a transductive inference. If the 100 widgets are considered the population, there is no sampling bias. In this way, transductive inference can be used as a kind of simplifying assumption for inductive inference. Transductive inference is inductive inference in which sample-population relations are ignored and sample statistics are treated as population statistics. The challenge of transductive inference is limited to developing useful descriptions (characterizing the patterns in the available data).
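
To make the contrast concrete, here is a hedged sketch of how the transductive and evidential answers to "what color is a defective widget?" can diverge when the sample is biased. The 30% population base rate and the assumption that the per-color defect rates carry over to the population are invented purely for illustration:

```python
# Counts from the text (the sample of 100 widgets).
p_def_given_white = 9 / 90    # 0.10
p_def_given_black = 5 / 10    # 0.50

# Transductive answer: treat the 100 widgets as the population.
p_white_given_def_transductive = 9 / 14          # ~0.64, "probably white"

# Evidential answer under an INVENTED assumption: only 30% of all widgets
# produced are white (the 90%-white sample was biased), while the per-color
# defect rates are assumed to carry over to the population.
p_white_pop = 0.30
numerator = p_def_given_white * p_white_pop
denominator = numerator + p_def_given_black * (1 - p_white_pop)
p_white_given_def_evidential = numerator / denominator   # ~0.08, "probably black"

print(round(p_white_given_def_transductive, 2), round(p_white_given_def_evidential, 2))
```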

URL: https://www.sciencedirect.com/science/article/pii/B9780128002834000010

Research and Methods

Steven C. Hayes, John T. Blackledge, in Comprehensive Clinical Psychology, 1998

3.02.3.4 Use of Statistics with Single-subject Data

For the most part, inferential statistics were designed for use in between-group comparisons. The assumptions underlying the widely accepted classical model of statistics are usually violated when statistical tests based on the model are applied to single-subject data. To begin with, presentation of conditions is not generally random in single-subject designs, and randomization is a necessary prerequisite to statistical analysis. More importantly (and more consistently), the independence of data required in classical statistics is generally not achieved when statistical analyses are applied to time-series data from a single subject (Sharpley & Alavosius, 1988). Busk and Marascuilo (1988) found, in a review of 101 baselines and 125 intervention phases from various single-subject experiments, that the autocorrelations in the data were, in most cases, significantly greater than zero and detectable even in cases of low statistical power. Several researchers have suggested using analyses based on a randomization test to circumvent the autocorrelation problem (Edgington, 1980; Levin, Marascuilo, & Hubert, 1978; Wampold & Furlong, 1981). For example, data from an alternating treatment design or extended complex phase change design, where the presentation of each phase is randomly determined, could be statistically analyzed by a procedure based on a randomization test. Some controversy surrounds the issue (Huitema, 1988), but the consensus seems to be that classical statistical analyses are too risky to use with individual time-series data unless at least 35–40 data points per phase are gathered (Horne, Yang, & Ware, 1982). Very few researchers have the good fortune to collect so much data.
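
As a hedged illustration (hypothetical scores, not any of the cited procedures), a randomization test for an alternating treatments design can be sketched as follows: the test statistic is the difference between phase means, and its null distribution comes from every equally sized relabeling of the sessions.

```python
from itertools import combinations

scores = [4, 7, 5, 8, 3, 9, 4, 8]                      # hypothetical session scores
labels = ["A", "B", "A", "B", "A", "B", "A", "B"]      # the randomly determined order

def mean_diff(scores, labels):
    a = [s for s, l in zip(scores, labels) if l == "A"]
    b = [s for s, l in zip(scores, labels) if l == "B"]
    return sum(b) / len(b) - sum(a) / len(a)

observed = mean_diff(scores, labels)

# Null distribution: assign the "B" label to every possible set of sessions
# of the same size and recompute the statistic.
n, k = len(scores), labels.count("B")
extreme = total = 0
for b_positions in combinations(range(n), k):
    relabeling = ["B" if i in b_positions else "A" for i in range(n)]
    total += 1
    if mean_diff(scores, relabeling) >= observed:
        extreme += 1

print(observed, extreme / total)   # 4.0 and 1/70 (about 0.014) for these made-up data
```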

Time-series analyses, in which the collected data are simply used to predict subsequent behavior (Gottman, 1981; Gottman & Glass, 1978), can also be used and are useful when such predictions are desired. However, such an analysis is not suitable for series with fewer than 20 points, as serial dependence and other factors will contribute to an overinflated alpha in such cases (Greenwood & Matyas, 1990). In cases where statistical analysis indicates that the data are not autocorrelated, basic inferential statistical procedures such as a t-test may be used. Finally, the Box-Jenkins procedure (Box & Jenkins, 1976) can technically be used to determine the presence of a main effect based on the departure of observed data from an established pattern. However, this procedure would require a minimum of about 50 data points per phase, and thus is impractical for all but a few single-subject analyses.
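
The autocorrelation check mentioned above can be illustrated with a minimal sketch (the data are made up): compute the lag-1 autocorrelation for each phase and consider a conventional t-test only if it is close to zero.

```python
# Lag-1 autocorrelation of a single phase's data (made-up values).
def lag1_autocorrelation(x):
    n = len(x)
    mean = sum(x) / n
    numerator = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    denominator = sum((v - mean) ** 2 for v in x)
    return numerator / denominator

baseline = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]    # hypothetical, steadily rising
print(round(lag1_autocorrelation(baseline), 2))        # 0.7: clearly autocorrelated

# Only if both phases showed autocorrelations near zero would an ordinary
# two-sample t-test (e.g., scipy.stats.ttest_ind) be defensible here.
```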

In addition, most statistical procedures are of unknown utility when used with single-subject data. As most statistical procedures and interpretations of their results were derived from between-group studies, use of these procedures in single-subject designs yields ambiguous results. A statistically significant result with a lone subject does not mean the same thing as a statistically significant result with a group, and the assumptions and evidentiary base supporting classical statistics simply do not tell us what a significant result with a single subject means. Beyond the technical incorrectness of applying nomothetic statistical approaches to idiographic data, such statistics are of extremely limited use in guiding further research and in bolstering confidence about an intervention's efficacy with an individual subject. If, for example, a statistically significant result were obtained in the treatment of a given client, this would tell us nothing about that treatment's efficacy with other potential clients. Moreover, data indicating a clinically significant change in a single client would be readily observable in a well-conducted and properly graphed single-subject experiment. Statistics, so necessary for detecting an overall positive effect in a group of subjects where some improved, some worsened, and some remained unchanged, would not be necessary in the case of one subject exhibiting one trend at any given time.

URL: https://www.sciencedirect.com/science/article/pii/B0080427073001851

Statistics

Ronald Rousseau, ... Raf Guns, in Becoming Metric-Wise, 2018

4.1 Introduction

Statistical analysis can be subdivided into two parts: descriptive statistics and inferential statistics. In descriptive statistics, one summarizes and graphically represents data from a sample or a whole population. In inferential statistics, one not only collects numerical data as a sample from a population but also analyzes those data and, based on this analysis, draws conclusions with estimated uncertainties (i.e., by using probability theory) about the population. It goes without saying that in order to measure aspects of scientific communication and to evaluate scientific research, scientists use statistical techniques. Although hundreds of books have been written on statistics, few deal explicitly with statistics in the framework of information and library science. A basic introductory text for library professionals is Vaughan (2001), while Egghe and Rousseau (2001) is more elementary. One quarter of Introduction to Informetrics (Egghe & Rousseau, 1990) is devoted to statistics. Ding et al. (2014) contains a practical introduction to recent developments in informetrics, including statistical methods.

The term population refers to the set of entities (physical or abstract) about which one seeks information. The publications of the scientists forming a research group, of the scientists in a country, or of the scientists active in a scientific domain, or the articles included in Scopus and published during the year 2015, are all examples of populations.

In order to investigate a population, the investigator collects data. If it is possible, the best option is to include the whole population in this investigation. Yet it is often impossible to collect data on the whole population, so the statistician collects a representative sample. This means that a subset is collected in such a way that it provides a miniature image of the whole population. If, moreover, the sample is large enough, then a diligent analysis of the sample will lead to conclusions that are, to a large extent, also valid for the whole population. Such conclusions must be reliable, which means, among other things, that the probability of their being correct must be known.
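
For instance, here is a small sketch (the numbers and the open-access example are invented, not taken from the book) of an inferential statement whose reliability is known: a 95% confidence interval for a population proportion, estimated from a random sample using the normal approximation.

```python
import math

n = 400            # hypothetical sample size
successes = 128    # hypothetical number of open-access articles in the sample
p_hat = successes / n
z = 1.96           # 95% confidence, normal approximation
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"{p_hat:.2f} +/- {margin:.2f}")   # roughly 0.32 +/- 0.05
```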

Classical inferential statistics draws samples from a population and then tries to obtain conclusions that are valid for the whole population (with a specified level of confidence). In informetrics there often are no samples; instead, one tries to draw conclusions based on an observed population, e.g., all journals included in Scopus. Does this make sense? We will not answer this question, but refer to Section 4.14 for some useful references related to this question.

This chapter is subdivided into two main parts. In Part A, we describe some techniques from descriptive statistics, while in Part B we discuss inferential statistics, including a short introduction to the normal distribution and a few nonparametric tests. Although we will not consider parametric tests, we nevertheless briefly introduce the normal distribution, as it is near-impossible to talk about statistics without in some way involving the normal distribution, e.g., when talking about z-scores. Multivariate techniques are beyond the scope of this introductory book; for these, we refer the reader to the specialized literature.
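
As a quick reminder of the z-score transformation mentioned above, here is a minimal sketch (the citation counts are hypothetical): each value is expressed as its distance from the sample mean in units of the sample standard deviation.

```python
citations = [3, 0, 7, 12, 5, 9, 2, 6]                      # hypothetical citation counts
mean = sum(citations) / len(citations)
sd = (sum((c - mean) ** 2 for c in citations) / (len(citations) - 1)) ** 0.5
z_scores = [(c - mean) / sd for c in citations]
print([round(z, 2) for z in z_scores])
```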

URL: https://www.sciencedirect.com/science/article/pii/B9780081024744000042

Basics

Colleen McCue, in Data Mining and Predictive Analysis, 2007

1.2 Inferential versus Descriptive Statistics and Data Mining

Descriptive statistics, as the name implies, is the process of categorizing and describing the information. Inferential statistics, on the other hand, includes the process of analyzing a sample of data and using it to draw inferences about the population from which it was drawn. With inferential statistics, we can test hypotheses and begin to explore causal relationships within data and information. In data mining, we are looking for useful relationships or models in the information, particularly those that can be used to anticipate or predict future events. Therefore, data mining more closely resembles descriptive statistics.

It was not that long ago that the process of exploring and describing data, descriptive statistics, was seen as the necessary though unglamorous prerequisite to the more important and exciting process of inferential statistics and hypothesis testing. In many ways, though, the creative exploration of data and information associated with descriptive statistical analysis is the essence of data mining, a process that, in skilled hands, can open new horizons in data and our understanding of the world.

URL: https://www.sciencedirect.com/science/article/pii/B9780750677967500234

STATISTICS IN ARCHAEOLOGY

Robert D. Drennan, in Encyclopedia of Archaeology, 2008

Sampling Bias

Whenever statements are made about a population on the basis of a sample, there is at least some risk of error. The tools of inferential statistics cannot eliminate this risk, but they provide powerful ways of assessing it and working with it. Practically every analysis in archaeology (whether statistical or not) involves characterizing a larger set of things than are actually observed, so the perspectives of inferential statistics have implications for archaeological analyses that reach far beyond the quantitative contexts they were designed for. The risk of error in characterizing a population on the basis of a sample arises from two fundamentally different sources. One is that the process of selecting the sample from the population may systematically produce a sample with characteristics different from those of the population at large. This is known as ‘sampling bias’. It happens, for example, when lithic debitage is recovered by passing excavated deposits through screens with 6 mm mesh. Lithic debitage is known to include very small waste flakes, many of which will pass through mesh of this size, so the flakes recovered from the screen will be systematically larger than those in the complete population. The mean weight of such a sample of waste flakes would be higher than that of the population as a whole, and any statement made on the basis of this sample about the weight of waste flakes in general would be inflated as a direct consequence of this sampling bias.
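
A hedged simulation (not from the article; the weight distribution and the 0.5 g cutoff are invented) makes the effect concrete: discarding the flakes light enough to fall through the mesh inflates the sample mean.

```python
import random

random.seed(1)
# Hypothetical population of waste-flake weights in grams, skewed toward
# many very light flakes.
population = [random.expovariate(1 / 2.0) for _ in range(10_000)]

# Flakes lighter than ~0.5 g are assumed (purely for illustration) to pass
# through the 6 mm screen and be lost.
recovered = [w for w in population if w >= 0.5]

print(round(sum(population) / len(population), 2))   # population mean, about 2.0
print(round(sum(recovered) / len(recovered), 2))     # biased (higher) sample mean, about 2.5
```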

Precisely the same is true even in entirely nonquantitative analyses. An archaeologist might subjectively characterize the lithic technology of the Archaic period in some region as reflecting a broad application of a high degree of technical skill. If this characterization of the large, vaguely defined population consisting of all lithic artifacts produced in the region during the Archaic period is based on a sample recovered by artifact collectors who value well-made bifacial tools and never bother to keep anything as mundane as a utilized flake, then the sample is clearly biased toward well-made tools, and the breadth of application of high technical skill will be overvalued as a direct consequence of sampling bias.

Rigorously random procedures for sample selection are designed to avoid bias, and neither of the sampling procedures in the examples above is random. There are no statistical tools for removing bias from a sample once it has been selected, whether by screening deposits through large mesh or by collectors. Indeed, the tools of inferential statistics are often said to ‘assume’ that samples are unbiased, and thus to be inapplicable to the analysis of biased samples. This is not a useful way to approach statistical analysis in archaeology, because archaeologists are often forced to work with samples they know to be biased. The prescription offered in other disciplines (collect another sample with rigorously random procedures that avoid bias) is often impossible in archaeology. Fortunately, there are at least two common ways of working with biased samples. Archaeologists may want to say things about populations that would not be affected by the bias present in the available sample. It might, for example, be perfectly possible to draw unbiased conclusions about the proportions of raw materials represented in lithic debitage on the basis of the screened sample discussed above. It would only be necessary to assume that different raw materials would not occur among the small waste flakes that fell through the screen in proportions very different from those among the larger flakes that would not escape the sample in this way. Biased samples can also often be accurately compared to each other if the same bias operated to the same degree in the selection of all samples to be compared. Thus, two samples of lithics recovered from 6 mm screens may show rather different flake weights. Such a difference cannot be attributed to sampling biases if the sampling biases were the same in both cases, and it is valid to say that the flakes in one sample are heavier than those in the other.
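
Extending the earlier simulation, here is a hedged sketch (again with invented materials and weights) of the first strategy: raw-material proportions survive the weight-based bias as long as the materials share a similar weight distribution, which is precisely the assumption described above.

```python
import random

random.seed(2)
materials = ["chert"] * 7_000 + ["obsidian"] * 3_000        # hypothetical 70/30 split
weights = [random.expovariate(1 / 2.0) for _ in materials]  # same weight model for both materials

recovered = [m for m, w in zip(materials, weights) if w >= 0.5]  # screen keeps flakes >= 0.5 g

print(materials.count("chert") / len(materials))                 # 0.70 in the population
print(round(recovered.count("chert") / len(recovered), 2))       # about 0.70 in the biased sample
```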

Precisely the same principles apply to subjective and qualitative comparisons. To compare the application of technical flint-knapping skill to the production of lithic tools in the Archaic with that in the Formative, it may well be possible to work successfully with biased samples. (This is likely to be necessary in any event.) As long as the same sampling biases operated in the recovery of both Archaic and Formative artifacts, then they can be compared. If, however, the Formative tools come from systematic excavations and the Archaic ones are those accumulated by artifact collectors, then the sampling biases are different and likely to affect very strongly precisely the characteristics of interest. Such a difference in sampling biases affects any comparison based on the abundance of ‘nice’ well-made tools, whether the comparison is quantitative or not. To repeat, statistical tools cannot eliminate bias once a sample has been selected (and the utter elimination of all kinds of sampling bias from the process of archaeological recovery is an unrealistic goal in any event). Statistics, however, does provide two very useful things: first, a reminder of the important effects that sampling bias can have, and, second, some useful concepts for thinking about sampling bias and how serious a worry it is in the specific context of particular observations of potential interest.

What is the concept of probability in statistics?

Probability is a number between 0 and 1 that describes the chance that a stated event will occur. An event is a specified set of outcomes of a random variable. Mutually exclusive events can occur only one at a time. Exhaustive events cover or contain all possible outcomes.
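
In symbols (a standard formulation, not taken from any of the chapters above):

```latex
% Probability of an event A, plus the two special cases defined above.
\[
0 \le P(A) \le 1
\]
\[
\text{mutually exclusive events: } P(A \cap B) = 0,
\qquad
\text{exhaustive events: } P(A_1 \cup A_2 \cup \dots \cup A_n) = 1
\]
```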

What are the uses of probability in statistics?

Probability provides information about the likelihood that something will happen. Meteorologists, for instance, use weather patterns to predict the probability of rain. In epidemiology, probability theory is used to understand the relationship between exposures and the risk of health effects.

Which branch of statistics involves probability?

Inferential statistics is the branch of statistics comprising techniques that allow statisticians to use data from a sample to draw conclusions, predict the behavior of a given population, and make judgments or decisions. Building on descriptive statistics, inferential statistics frequently speaks in terms of probability.

What are the areas in which statistics is used?

It is applied in marketing, e-commerce, banking, finance, human resources, production, and information technology. In addition, this mathematical discipline has been a prominent part of research and is widely used in data mining, medicine, aerospace, robotics, psychology, and machine learning.