2005-2006 Session
October 12th 2005, 2pm to 5pm at MANDEC (Manchester Dental Education Centre), Higher Cambridge Street (building 41, entrance on corner facing building 35). Tea will be served about mid-afternoon.
Joint meeting with Manchester University's Biostats Group
Theme: "Bioinformatics"

NICK FIELLER
Gene Expression and Annotation
Various forms of oligonucleotide microarray allow direct measurement of gene expression in samples from human subjects. These measurements are made with the aim of providing insight into the biological processes of some condition (e.g. cancer), for example which genes play key roles in its development. Typically, many thousands of genes are measured on relatively few subjects and with relatively sparse replication. From the statistical viewpoint, the major problem is the analysis of very high dimensional data with limited numbers of observations and poor replication. However, additional information is available. Most obviously there is concomitant information on the subjects themselves, including severity of condition and demographic information. Appropriate use of this will enhance statistical analysis. Less well known is the availability of information on the genes, which could play a dual role in the analysis. The broad term for this information is 'annotation'. Just as subjects with common characteristics might be expected to have similar gene expression profiles, it might be anticipated that genes with some common annotation feature might display similarities. A particular form of annotation is whether a gene has been referred to in connection with a biological function or disease. Text mining techniques can determine the number of such citations in a textbase relating genes to a Medical Subject Heading (i.e. a MeSH category as defined in the US National Library of Medicine's controlled vocabulary used for indexing). This can provide a measure of linkage between genes. Since such information is typically extremely sparse, use of the published MeSH hierarchies of terms allows grouping of categories at various levels and hence a measure of further connections between genes.
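
The idea of measuring linkage between genes from MeSH citation counts, and of grouping sparse categories through the hierarchy, can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the talk: the gene names and counts are invented, the category identifiers are made up in the dotted style of MeSH tree numbers, and cosine similarity is just one plausible choice of linkage measure.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical data: gene -> {category identifier: number of citations}.
# Identifiers are invented, written in the dotted style of MeSH tree numbers.
counts = {
    "GENE_A": {"C04.588.180": 3},
    "GENE_B": {"C04.588.274": 2},
    "GENE_C": {"C14.280.647": 5},
}


def roll_up(gene_counts, depth):
    """Sum citation counts over the first `depth` levels of each identifier."""
    rolled = defaultdict(int)
    for tree_number, n in gene_counts.items():
        parent = ".".join(tree_number.split(".")[:depth])
        rolled[parent] += n
    return dict(rolled)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def linkage(gene1, gene2, depth):
    """Similarity of two genes after grouping categories at the given depth."""
    return cosine(roll_up(counts[gene1], depth), roll_up(counts[gene2], depth))


print(linkage("GENE_A", "GENE_B", 3))  # 0.0: no shared category at the finest level
print(linkage("GENE_A", "GENE_B", 2))  # 1.0: both fall under the same parent category
print(linkage("GENE_A", "GENE_C", 1))  # 0.0: different top-level branches
```

At the finest level the two genes share no category, so the sparse counts show no linkage; grouping one level up the hierarchy reveals a connection that the raw counts miss.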

TOM NYE
Uncovering evolutionary history: new methods for inferring phylogenies
Evolutionary relationships between species can be represented by a tree: the leaf nodes represent extant species, interior nodes represent ancestral species, and the branch lengths indicate the extent to which species have diverged. Such trees are referred to as phylogenies, and there is a range of different statistical methods available for inferring the phylogeny of a set of species given their DNA sequences. The first half of the talk serves as a gentle introduction to the main statistical methods used to infer phylogenies. We will then go on to look in more detail at the so-called distance matrix methods and describe some new results in this area.

MAGNUS RATTRAY
Propagating Measurement Uncertainty in Microarray Data Analysis
High density microarrays were first introduced a decade ago and since then they have played an increasingly important role in many areas of biological and biomedical research. Microarrays can be used to simultaneously measure the concentration of many species of RNA molecules within a sample derived from a tissue of interest. This allows the expression level of tens of thousands of genes to be measured in a single experiment. However, this technology is associated with many sources of experimental uncertainty and noise. In this talk I will discuss approaches for dealing with this uncertainty. I will focus on the analysis of oligonucleotide arrays, such as the popular Affymetrix GeneChip array, which contain multiple short specific probe sequences for each target RNA. This set of probes can be used to determine an accurate estimate for the target concentration and can also be used to determine the uncertainty associated with this measurement. The measurement uncertainty can then be propagated through the downstream analysis using probabilistic methods.
We show how this approach leads to improved methods for combining information from replicate experiments, identifying differential expression and reducing dimensionality. Paper preprints are available from http://www.bioinf.man.ac.uk/resources/puma/

December 7th 2005 at MMU, Room E34, John Dalton Building (opp BBC), 4.30pm for 5.00pm. Note the change from the usual room.

SHIRLEY COLEMAN (ISRU, Newcastle University)
Much consultancy in industry is about hunting for faults and causes. Sometimes the evidence is clear - increase in waste, staff malaise. But there are layers to unravel before confirming such conclusions and we need to apply statistical tools and techniques to analyse and investigate. We describe some recent projects and how we meet the challenge of keeping up to date. The European Network for Business and Industrial Statistics (ENBIS) is full of fault hunters and being part of it is very helpful. The talk will describe some recent projects and outline ENBIS.

January 11th 2006 at MMU All Saints West Building, Room 2.05, 2.00 prompt to 5.30pm (tea at 3.30pm). Room 2.05 is in building 2.
Joint meeting with the RSS General Applications Section
Speakers:

TONY GARDNER-MEDWIN (University College, London)
I shall challenge the conventional view that "beyond reasonable doubt" in criminal trials is a matter of a high threshold on the probability of guilt. This, I contend, is a necessary but not a sufficient condition for conviction. A jury will conclude that P(guilt) based on the evidence is high if, in their judgment, such evidence would arise much more frequently in relation to a guilty person than an innocent one. But even when this is true, the evidence may be expected to arise for innocent persons with non-negligible frequency. My contention is that there is then "reasonable doubt": an innocent defendant could find themselves facing such evidence, even though in nearly all such similar cases the defendant would be guilty. The defendant should be acquitted. Addressing uncertainties and probabilities conditional on the hypothesis of innocence becomes the most critical issue for a jury to address. Seldom of course is it possible to address these issues with any quantitative precision. But shifting to questions that are conditional on presumption of innocence has immediate implications for the kinds of evidence that are relevant.
A criminal record is relevant to P(guilt) but not to P(evidence, if the defendant is innocent of the current allegation). Statistics for the incidence of infanticide in a population are relevant to one's judgment of P(guilt) given that a person's child has died mysteriously, but they are not relevant to one's judgment of how often such circumstances would arise for an innocent person. If "reasonable doubt" is a question about probabilities conditional on innocence, then clear rational arguments emerge for long standing legal principles about what is and is not admissible evidence, normally justified by rather vague and incomplete reference to concepts such as fairness and morality, prejudicial versus probative impact, or supposed irrelevance or incomprehension of statistical arguments.

DAVID BALDING (Imperial College, London)
Many legal questions revolve around establishing the relatedness of two or more individuals using DNA profile data. The use of likelihood ratios to answer such questions is now uncontroversial, and Mendel's laws of inheritance, on which the likelihood calculations are based, are straightforward. Yet there nevertheless remain many potential complexities. How does uncertainty about allele proportions affect the computations? How helpful is it to have genotypes of other individuals whose relatedness to one or more of the parties is not in dispute? What if there are many individuals and many alternative hypotheses to consider? I will discuss these and related issues while focussing on a specific application to a group of individuals, each the offspring of anonymous donor insemination, who wished to know which, if any, of them had a common natural father.

STEPHEN SENN (University of Glasgow)
We live in a litigious age in which the public seems to have extravagant expectations as to what medicine ought to deliver and in which no 'accident' occurs without someone being to blame.
What is good news for lawyers is not necessarily good news for the rest of us. I consider some implications of public expectation of proof of non-harm for patients, physicians, investigative journalists, lawyers, drug developers, regulators and statisticians.

PATRICK LAYCOCK (Manchester University)
I have interacted with the Law at various levels during my career as a professional statistician. In particular, I have had a part-time post for many years as a Senior Consultant with Capcis, an engineering consultancy. Many of the resulting statistical reports concerned some legal dispute or insurance claim, often involving several parties, over the attachment of blame, or otherwise, to one or more of the parties following some industrial hiccup or catastrophe. Such disputes often drag on for many years and are typically settled 'out of court and on the day'. Except on one occasion, these are my 'second order interactions': they have not involved an appearance in court. My 'first order interactions' - which involve court appearances - have largely burgeoned since my retirement from teaching and have mostly concerned the statistical significance, or otherwise, of Crown prosecution statements concerning the detection of Class A and Class B drugs on banknotes. I will discuss some of the statistical problems raised and techniques I have used in these cases and attempt to convey some of the pleasures and frustrations they have brought me. It is both the blessing and the tragedy of our subject area that so many people find it so useful.

Meeting Organiser: Richard Boys
Telephone: +44 (0)191 2227297
The RSS has established a Statistics and the Law working party, chaired by Professor Colin Aitken of Edinburgh University, to address some of the main issues.

February 15th 2006 at MMU, 4.30pm for 5.00pm

JIM FREEMAN, NOELIA OSES (Manchester Business School)
Slot machine games have evolved beyond all recognition since the early days of simple spinning reels. Nowadays a variety of set-piece routines (e.g. HI-LO, bonus game, power line ...) can be combined in a seemingly endless succession of attractive designs - each with its own distinctive statistical characteristics. An overview of some of the more popular game elements - with details of corresponding solution strategies - will be presented.

March 15th 2006 at MMU, Room E29, John Dalton Building (opp BBC), 4.30pm for 5.00pm

PETER DIGGLE (Lancaster University, Johns Hopkins University)
Model-based Geostatistics for Tropical Disease Epidemiology
Geostatistical methods are relevant when there is scientific interest in the behaviour of a spatially continuous process S(x) which is not directly observable. Instead, spatially discrete data Y_i : i = 1, ..., n are available, and Y_i is stochastically related to S(x_i). Problems of this kind arise naturally in tropical disease epidemiology because complete assessment of disease incidence or prevalence in the population of interest (typically one or more developing countries) is infeasible. Instead, spatial variation in incidence or prevalence must be inferred from incomplete sampling of selected communities within the population of interest. Diggle, Moyeed and Tawn (1998) coined the phrase "model-based geostatistics" to mean the application of general principles of statistical modelling and inference to geostatistical problems. In this talk, I will review the basic ideas of model-based geostatistics and describe an application to the estimation of spatial variation in the prevalence of Loa loa (eyeworm) in sub-Saharan Africa. I will also outline some preliminary results on a bivariate extension of the methodology, in which prevalence data are obtained from two different survey instruments: parasitological sampling (microscopic examination of blood samples for the presence of Loa loa parasites), and RAPLOA (a cheaper but potentially less precise questionnaire-based method devised by WHO scientists).
Diggle, P.J., Moyeed, R.A. and Tawn, J.A. (1998). Model-based Geostatistics (with Discussion). Applied Statistics 47, 299-350.

April 26th 2006 at MMU, 4.30pm for 5.00pm (preceded by a short AGM)

DAVID FORREST (Salford University)
Statistics in sport
Application of statistical techniques in the analysis of sport has been wide ranging. My presentation will illustrate the variety of settings in which Statistics has been used, focusing on topics from my own research agenda such as: what determines the size of a crowd at a football game, and can Lancashire's failure to win the County Cricket Championship for fifty years be attributed to rainy weather?