2005-2006 Session

October 12th 2005 December 7th 2005 January 11th 2006
February 15th 2006 March 15th 2006 April 26th 2006

October 12th 2005, 2pm to 5pm at MANDEC (Manchester Dental Education Centre),
Higher Cambridge Street
(building 41, entrance on corner facing building 35)
(tea will be served about mid-afternoon)

Joint meeting with Manchester University's Biostats Group

Theme: "Bioinformatics"

NICK FIELLER

Gene Expression and Annotation

Various forms of oligonucleotide microarrays allow direct measurement of gene expression in samples from human subjects. Such measurements are made with the aim of providing insight into the biological processes of some condition (e.g. cancer), for example which genes play key roles in its development. Typically, many thousands of genes are measured on relatively few subjects and with relatively sparse replication. From the statistical viewpoint, the major problem is the analysis of very high dimensional data with limited numbers of observations and poor replication.

However, additional information is available. Most obviously there is concomitant information on the subjects themselves, including severity of condition and demographic information.  Appropriate use of this will enhance statistical analysis. Less well known is the availability of information on the genes which could play a dual role in the analysis. The broad term for this information is 'annotation'.  Just as subjects with common characteristics might be expected to have similar gene expression profiles it might be anticipated that genes with some common annotation feature might display similarities.

A particular form of annotation is whether a gene has been referred to in connection with a biological function or disease.  Text mining techniques can determine the number of such citations in a textbase relating genes to a Medical Subject Heading (i.e. MeSH category as defined in the US National Library of Medicine's controlled vocabulary used for indexing). This can provide a measure of linkage between genes.  Since such information is typically extremely sparse, use of the published MeSH hierarchies of terms allows grouping of categories at various levels and hence a measure of further connections between genes.

TOM NYE

Uncovering evolutionary history: new methods for inferring phylogenies

Evolutionary relationships between species can be represented by a tree: the leaf nodes represent extant species, interior nodes represent ancestral species, and the branch lengths indicate the extent to which species have diverged. Such trees are referred to as phylogenies, and there is a range of different statistical methods available for inferring the phylogeny of a set of species given their DNA sequences.

The first half of the talk serves as a gentle introduction to the main statistical methods used to infer phylogenies. We will then go on to look in more detail at the so-called distance matrix methods and describe some new results in this area.
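As background to the distance matrix methods mentioned above, the following is a minimal sketch of how a distance matrix might be built from aligned DNA sequences. It uses the standard Jukes-Cantor correction, which is an assumption made here for illustration and not necessarily the distance used in the talk; the species names and sequences are hypothetical toy data.

```python
import math

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance: -3/4 * ln(1 - 4p/3)."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Return sorted species names and the pairwise distance matrix."""
    names = sorted(seqs)
    matrix = [[jukes_cantor(p_distance(seqs[a], seqs[b])) for b in names]
              for a in names]
    return names, matrix

# Hypothetical toy alignment, for illustration only.
toy = {
    "species_a": "ACGTACGA",
    "species_b": "ACGTACGT",
    "species_c": "ACTTACCA",
}
names, dist = distance_matrix(toy)
```

A tree-building method such as neighbour joining would then take `dist` as its input.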

Tom's talk

MAGNUS RATTRAY

Propagating Measurement Uncertainty in Microarray Data Analysis

High density microarrays were first introduced a decade ago and since then they have played an increasingly important role in many areas of biological and biomedical research. Microarrays can be used to simultaneously measure the concentration of many species of RNA molecules within a sample derived from a tissue of interest. This allows the expression level of tens of thousands of genes to be measured in a single experiment. However, this technology is associated with many sources of experimental uncertainty and noise.

In this talk I will discuss approaches for dealing with this uncertainty. I will focus on the analysis of oligonucleotide arrays, such as the popular Affymetrix GeneChip array, which contain multiple short specific probe sequences for each target RNA.

This set of probes can be used to determine an accurate estimate for the target concentration and can also be used to determine the uncertainty associated with this measurement. The measurement uncertainty can then be propagated through the downstream analysis using probabilistic methods. We show how this approach leads to improved methods for combining information from replicate experiments, identifying differential expression and reducing dimensionality.
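To illustrate why carrying the measurement uncertainty forward can help when combining replicates, here is a minimal sketch using inverse-variance weighting. This is a simple stand-in chosen for illustration, not the probabilistic method described in the talk, and the function name and numbers are hypothetical.

```python
def combine_replicates(estimates, variances):
    """Inverse-variance weighted mean of replicate expression estimates.

    Returns the combined estimate and its variance. Replicates measured
    with more uncertainty receive less weight.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total

# Hypothetical log-expression estimates for one gene on two arrays,
# the second measured with four times the variance of the first.
combined, combined_var = combine_replicates([8.0, 9.0], [0.25, 1.0])
```

With these numbers the combined estimate sits much closer to the more precise replicate than a plain average would.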

Magnus's talk

Paper preprints are available from http://www.bioinf.man.ac.uk/resources/puma/


December 7th 2005 at MMU Room E34, John Dalton Building (opp BBC), 4.30pm for 5.00pm

Note the change from the usual room

SHIRLEY COLEMAN (ISRU, Newcastle University)

Tales from fault hunting

Much consultancy in industry is about hunting for faults and causes.  Sometimes the evidence is clear - increase in waste, staff malaise.  But there are layers to unravel before confirming such conclusions and we need to apply statistical tools and techniques to analyse and investigate.  We describe some recent projects and how we meet the challenge of keeping up to date.

The European Network for Business and Industrial Statistics (ENBIS) is full of fault hunters and being part of it is very helpful. The talk will describe some recent projects and outline ENBIS.

Shirley's talk


January 11th 2006 at MMU All Saints West Building, Room 2.05, 2.00 prompt to 5.30pm (tea at 3.30pm)

Room 2.05 in building 2 on this map

Joint meeting with the RSS General Applications Section

An afternoon on statistics and the law

Speakers:

TONY GARDNER-MEDWIN (University College, London)

Reasonable Doubt: What kind of probability is at issue?

I shall challenge the conventional view that "beyond reasonable doubt" in criminal trials is a matter of a high threshold on the probability of guilt. This I contend is a necessary but not a sufficient condition for conviction.

A jury will conclude that P(guilt) based on the evidence is high if, in their judgment, such evidence would arise much more frequently in relation to a guilty person than an innocent one. But even when this is true, the evidence may be expected to arise for innocent persons with non-negligible frequency.

My contention is that there is then "reasonable doubt": an innocent defendant could find themselves facing such evidence, even though in nearly all such similar cases the defendant would be guilty. The defendant should be acquitted.

Assessing uncertainties and probabilities conditional on the hypothesis of innocence becomes the most critical issue for a jury to address. Seldom of course is it possible to address these issues with any quantitative precision. But shifting to questions that are conditional on presumption of innocence has immediate implications for the kinds of evidence that are relevant. A criminal record is relevant to P(guilt) but not to P(evidence, if the defendant is innocent of the current allegation).

Statistics for the incidence of infanticide in a population are relevant to one's judgment of P(guilt) given that a person's child has died mysteriously, but they are not relevant to one's judgment of how often such circumstances would arise for an innocent person. If "reasonable doubt" is a question about probabilities conditional on innocence, then clear rational arguments emerge for long standing legal principles about what is and is not admissible evidence, normally justified by rather vague and incomplete reference to concepts such as fairness and morality, prejudicial versus probative impact, or supposed irrelevance or incomprehension of statistical arguments.
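The distinction drawn above between P(guilt) given the evidence and the probability of the evidence given innocence can be made concrete with a small calculation. The sketch below applies Bayes' theorem; all of the numbers are hypothetical and chosen only for illustration.

```python
def posterior_guilt(prior, p_e_guilty, p_e_innocent):
    """Bayes' theorem: P(guilt | evidence)."""
    numerator = prior * p_e_guilty
    return numerator / (numerator + (1.0 - prior) * p_e_innocent)

# Hypothetical values, for illustration only.
prior = 0.5          # P(guilt) before the evidence is considered
p_e_guilty = 0.9     # P(evidence | guilty)
p_e_innocent = 0.01  # P(evidence | innocent)

post = posterior_guilt(prior, p_e_guilty, p_e_innocent)

# Expected number of innocent people, out of 10,000 in a comparable
# position, who would nevertheless face the same evidence.
innocent_facing_evidence = 10000 * p_e_innocent
```

Here P(guilt | evidence) is about 0.99, yet roughly 100 in every 10,000 innocent people would face the same evidence, which is the situation the abstract describes as leaving reasonable doubt.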

Tony's talk

DAVID BALDING (Imperial College, London)

Assessing relatedness between groups of individuals

Many legal questions revolve around establishing the relatedness of two or more individuals using DNA profile data.

The use of likelihood ratios to answer such questions is now uncontroversial, and Mendel's laws of inheritance on which the likelihood calculations are based are straightforward.

Yet there remain many potential complexities. How does uncertainty about allele proportions affect the computations? How helpful is it to have genotypes of other individuals whose relatedness to one or more of the parties is not in dispute? What if there are many individuals and many alternative hypotheses to consider?

I will discuss these and related issues while focussing on a specific application to a group of individuals, each the offspring of anonymous donor insemination, who wished to know which, if any, of them had a common natural father.

David's talk

STEPHEN SENN (University of Glasgow)

How much shyster do you want with your quack?

We live in a litigious age in which the public seems to have extravagant expectations as to what medicine ought to deliver and in which no 'accident' occurs without someone being to blame. What is good news for lawyers is not necessarily good news for the rest of us. I consider some implications of public expectation of proof of non-harm for patients, physicians, investigative journalists, lawyers, drug developers and regulators and statisticians.

Stephen's talk

PATRICK LAYCOCK (Manchester University)

First and Second Order interactions with the Law

I have interacted with the Law at various levels during my career as a professional statistician. In particular, I have had a part-time post for many years as a Senior Consultant with Capcis, an engineering consultancy. Many of my statistical reports there concerned some legal dispute or insurance claim, often involving several parties, over the attachment of blame, or otherwise, to one or more of the parties following some industrial hiccup or catastrophe. Such disputes often drag on for many years and are typically settled 'out of court and on the day'. Except on one occasion, these are my 'second order interactions' - they have not involved an appearance in court. My 'first order interactions' - which involve court appearances - have largely burgeoned since my retirement from teaching and have mostly concerned the statistical significance, or otherwise, of Crown prosecution statements concerning the detection of Class A and Class B drugs on banknotes.

I will discuss some of the statistical problems raised and techniques I have used in these cases and attempt to convey some of the pleasures and frustrations they have brought me. It is both the blessing and the tragedy of our subject area that so many people find it so useful.

Patrick's talk

Meeting Organiser: Richard Boys Telephone: +44 (0)191 2227297

The RSS has established a Statistics and the Law working party chaired by Professor Colin Aitken of Edinburgh University, to address some of the main issues.


February 15th 2006 at MMU, 4.30pm for 5.00pm

JIM FREEMAN, NOELIA OSES (Manchester Business School)

Gaming routines in the slot machine industry: An analytical overview

Slot machine games have evolved beyond all recognition since the early days of simple spinning reels. Nowadays a variety of set-piece routines (e.g. HI-LO, bonus game, power line, ...) can be combined in a seemingly endless succession of attractive designs - each with its own distinctive statistical characteristics.

An overview of some of the more popular game elements - with details of corresponding solution strategies - will be presented.

Noelia's talk and website


March 15th at MMU, Room E29, John Dalton Building (opp BBC), 4.30pm for 5.00pm

MEETING POSTPONED
Note the change from the usual room

PETER DIGGLE (Lancaster University, Johns Hopkins University)

Model-based Geostatistics for Tropical Disease Epidemiology

Geostatistical methods are relevant when there is scientific interest in the behaviour of a spatially continuous process S(x) which is not directly observable.

Instead, spatially discrete data Y_i : i = 1, ..., n are available, and Y_i is stochastically related to S(x_i).

Problems of this kind arise naturally in tropical disease epidemiology because complete assessment of disease incidence or prevalence in the population of interest (typically one or more developing countries) is infeasible. Instead, spatial variation in incidence or prevalence must be inferred from incomplete sampling of selected communities within the population of interest.

Diggle, Moyeed and Tawn (1998) coined the phrase "model-based geostatistics" to mean the application of general principles of statistical modelling and inference to geostatistical problems.

In this talk, I will review the basic ideas of model-based geostatistics and describe an application to the estimation of spatial variation in the prevalence of Loa loa (African eye worm) in sub-Saharan Africa. I will also outline some preliminary results on a bivariate extension of the methodology, in which prevalence data are obtained from two different survey instruments: parasitological sampling (microscopic examination of blood samples for the presence of Loa loa parasites), and RAPLOA (a cheaper but potentially less precise questionnaire-based method devised by WHO scientists).
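The setting described above, a latent spatial process S(x) observed only through discrete data at sampled locations, can be sketched with a small simulation. The specific choices below (a Gaussian process with exponential covariance, a logit link and binomial counts, and all parameter values) are assumptions made for illustration and are not taken from the talk.

```python
import math
import random

def exp_cov(points, sigma2=1.0, phi=0.3):
    """Exponential covariance matrix: sigma2 * exp(-distance / phi)."""
    return [[sigma2 * math.exp(-math.dist(p, q) / phi) for q in points]
            for p in points]

def cholesky(a):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate(n_sites=20, n_sampled=50, mu=-1.0, seed=1):
    """Simulate S(x_i) at random sites and binomial counts Y_i."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_sites)]
    low = cholesky(exp_cov(points))
    z = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
    # S = mu + L z has the required spatial covariance.
    s = [mu + sum(low[i][k] * z[k] for k in range(i + 1))
         for i in range(n_sites)]
    y = []
    for s_i in s:
        prevalence = 1.0 / (1.0 + math.exp(-s_i))  # logit link
        y.append(sum(rng.random() < prevalence for _ in range(n_sampled)))
    return points, s, y

points, s, y = simulate()
```

In practice only `points` and `y` would be observed, and the inferential task is to recover the surface S(x) from them.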

Diggle, P.J., Moyeed, R.A. and Tawn, J.A. (1998). Model-based Geostatistics (with Discussion). Applied Statistics 47 299-350.


April 26th at MMU, 4.30pm for 5.00pm (preceded by a short AGM)

DAVID FORREST (Salford University)

Statistics in sport

Application of statistical techniques in the analysis of sport has been wide ranging. My presentation will illustrate the variety of settings in which Statistics has been used, focusing on topics from my own research agenda such as: what determines the size of a crowd at a football game and can Lancashire's failure to win the County Cricket Championship for fifty years be attributed to rainy weather?

Dave's talk