Advancements to State-Space Models (SSMs) for Fisheries Science
The CANSSI Collaborative Research Team (CRT) for the Advancements to State-Space Models (SSMs) for Fisheries Science met in June 2015 at Dalhousie University, Halifax, Nova Scotia. The workshop successfully demonstrated recent theoretical and software developments for the fitting of SSMs. It also initiated new collaborations with scientists from Fisheries and Oceans Canada (DFO) as well as the Northwest Fisheries Science Center (NWFSC-NOAA) around complicated stock assessment data. In November, members of the CRT presented the latest work on robust estimation of fixed parameters in SSMs at the BIRS workshop titled "Current and Future Challenges in Robust Statistics". In late November, the CRT Team Leader was an expert reviewer for the Northern Cod Framework Review for DFO in St. John's, Newfoundland. The next full meeting of the CRT will likely be held at the DFO Maurice Lamontagne Institute in May 2016.
Copula Dependence Modeling: Theory and Applications
The team had a strong representation at a copula workshop held in Oberwolfach, Germany in April 2015. E.F. Acar, C. Genest, J.G. Nešlehová, B. Rémillard and L.-P. Rivest attended and made presentations on their current research. E.F. Acar also organized an invited session to present the CRT's recent work at the Halifax SSC meeting in June 2015. Presentations were made by B. Rémillard, L.-P. Rivest, and E.F. Acar. Harry Joe was an invited speaker at the International Symposium on Dependence and Copulas 2015 held at the Institute of Statistical Mathematics, Tachikawa, Japan, in June 2015.
A competition for post-doctoral students was held and two candidates were awarded scholarships in the amount of $40,000 for one year. Caren Hasler will be working with R.V. Craiu in Toronto (co-director L.-P. Rivest) and Yue Zhao will be based at McGill, where he will work with C. Genest and J.G. Nešlehová. One M.Sc. student, Simon Chatelain, had a three-month internship at IREQ, the Hydro-Québec Research Institute, under the supervision of Luc Perreault. He worked on the validation of hydro-meteorological forecasts.
Three papers related to the CRT research project, authored by team members, were either published, or accepted for publication, in 2015:
- M.-P. Côté & C. Genest (2015). A copula-based risk aggregation model. The Canadian Journal of Statistics, 43, 60-81.
- F. Tounkara & L.-P. Rivest (2015). Mixture regression models for closed population capture-recapture data. Biometrics, 71, 721-730.
- J.-F. Quessy, L.-P. Rivest & M.-H. Toupin (2015). Semi-parametric pairwise inference methods in spatial models based on copulas. Spatial Statistics, 14, 472-490.
Modern Spectrum Methods in Time Series Analysis: Physical Science, Environmental Science and Computer Modeling
Several team members got together at AGU this spring for informal discussions.
The formal opening workshop took place on October 18-19, immediately after the Canadian Solar Workshop, at Le Petite Rouge (St. Emile du Suffolk, PQ). Speakers included Alan Chave (C.Stat, WHOI/MIT) and Frank Vernon (Scripps, UCSD).
The program has already started some collaborations: Keith Thompson (Dal), Frank Vernon (UCSD), and David Thomson (Queen's) are working together. Another group includes J.-P. St-Maurice and Sasha Koustov, both of the University of Saskatchewan, and Robyn Fiori of NRCAN.
There are two new graduate students and an undergraduate (Mark Tamming) at Queen's. Mark is a USRA student who spent the summer improving the program that converts scanned magnetograms into time series, and he managed to produce a complete 90-day time series at 1-minute resolution from the 1890s. He was very meticulous about time accuracy. Solar p-modes are obvious in the spectrum (he also learned enough about multitaper estimates to compute a spectrum, which is pretty good for a student just starting his third year!). This matters because p-mode frequencies depend on solar irradiance, so the result gives a new way to check irradiance reconstructions that have been made from sunspot observations. Clearly, this team is off to a good start!
Statistical Inference for Complex Surveys with Missing Observations
This team held their first workshop at CRM on October 25 and 26. The objective of the workshop was to take stock of new developments in the field of missing survey data, to bring together some of the most active researchers in the field, and to identify the current challenges. The overall goal was to gain a collective view of recent advances in this field, to generate new ideas and training opportunities, as well as to foster interaction between members of the team.
Mohammad Ehsanul Karim of UBC receives the Statistics on Reels Participation Award for his film entitled "The Performance of Statistical Learning Approaches to Construct Inverse Probability Weights in Marginal Structural Cox Models: A Simulation-based Comparison". (Photo credit: Peter Macdonald)
Keelin Greenlaw of UVic (4th from the right) receives the CANSSI-sponsored Best Paper Award for the paper entitled "Empirical Bayes Multivariate Group-Sparse Regression for Brain Imaging". (Photo credit: Peter Macdonald)
The International Symposium in Statistics (ISS) 2015 was hosted by the Department of Mathematics and Statistics at Memorial University and took place at the Holiday Inn, St. John's, NL, from July 6 to 8, 2015. The Canadian Statistical Sciences Institute (CANSSI) was among the co-sponsors of the symposium. The meeting covered five specialized research themes: multi-dimensional data analysis in continuous setups; multivariate analysis for longitudinal categorical data; time series with financial and environmental applications; spatial-temporal data analysis; and familial longitudinal data analysis in semi-parametric setups. Forty-six delegates attended from many countries, including Brazil, France, India, Switzerland, the USA and Canada. The meeting was a grand success, with an excellent academic program complemented by two social events: the symposium banquet and a whale and puffin watching tour.
The symposium welcome address was given by Charmaine Dean, former President of the SSC (Statistical Society of Canada) and the current Dean of Science of the University of Western Ontario. Alwell Oyet, Deputy Head, welcomed the delegates on behalf of the host department. Brajendra Sutradhar, General Chair of the symposium, welcomed all guests and provided a brief history of the past two symposiums (ISS-2009, ISS-2012) and their connection to the present symposium (ISS-2015). He thanked all sponsors, in particular Memorial University, CANSSI, and AARMS (Atlantic Association for Research in the Mathematical Sciences) for their support in organizing this meeting.
Statistical Inference, Learning and Models for Big Data
The Fields Institute for Research in Mathematical Sciences hosted a six-month thematic program on Statistical Inference, Learning and Models for Big Data from January to June 2015. This program focused on the study and advancement of inferential techniques for statistical learning in big data. The emphasis on inference was prompted by the urgent need for new statistical, computational, and mathematical research to address the ever-increasing demands of big data.
The program committee was appointed by CANSSI, and allied events on the same topic took place across the country, at PIMS, CRM, and AARMS. The scientific program had two complementary strands. One strand emphasized inference and data in particular substantive areas: social policy, health policy, networks, and environmental science. The other focused on cross-cutting areas of mathematical, computational and statistical sciences, including statistical learning, visualization, optimization, and new inferential paradigms.
Several other activities at the Fields Institute also focused on inference for big data. These included the 2014 and 2015 Distinguished Lecture Series in Statistics, given by Bin Yu and by Terry Speed; the Coxeter Lectures, given by Michael Jordan; a workshop on Complex Spatio-Temporal Data Structures; a workshop on Big Data in Commercial and Retail Banking; a Distinguished Public Lecture by Andrew Lo; an Industrial Problem Solving Workshop; and a wildly successful and over-subscribed graduate student research day, which featured Robert Bell from AT&T, Alekh Aggarwal from Google, and Kevin Murphy from Microsoft.
The training program was anchored by two graduate courses, as well as the opening workshop and boot camp (January 9 to 23); these gave graduate students and postdoctoral fellows unique exposure to cutting-edge research in a wide range of areas. There were six postdoctoral fellows in residence throughout the program. One of them, Einat Gil, a specialist in learning environments and technology, developed a very interesting program on big data for Grade 12 mathematics students, which was piloted at a local high school for five weeks. All activities at the Fields Institute were streamed using FieldsLive, which was widely used both during the workshops and later: the archive is an invaluable research resource.
A detailed report of the Opening Conference and Boot Camp, prepared by the postdoctoral fellows and long-term visitors, identified a number of common challenges and a number of common approaches. One set of challenges arises from volume: most statistical and computational methods do not scale well, and simply fail on very large datasets. Special infrastructure, such as clusters of computers combined with parallel processing, selective sampling, and so-called divide-and-recombine techniques, all have a role to play. Big data may be "long", involving a very large number of individual observations, or "wide", involving a very large number of measurements on a relatively small number of individuals. Both raise difficulties: the former requires modeling potentially complex dependencies, and the latter causes traditional statistical theory and methods to fail. Common solutions to many of these problems include building more complex models, assuming underlying sparsity, developing non-convex optimization techniques, developing new visualization tools and developing new asymptotic theories. All of these approaches were developed and extended in the various workshops.
The Summer 2015 Newsletter of the Fields Institute presents a detailed summary of the five workshops that took place there:
- Big Data and Statistical Machine Learning
- Optimization and Matrix Methods
- Visualization for Big Data: Strategies and Principles
- Big Data in Health Policy
- Big Data for Social Policy
The Pacific Institute for the Mathematical Sciences hosted workshops on Statistical Inference for Large-Scale Data and on Big Data in Environmental Science. The Centre de recherches mathématiques hosted a workshop on Statistical and Computational Challenges in Networks and Cybersecurity and a summer school on Deep Learning.
On June 12 and 13, Dalhousie hosted the Closing Workshop, Statistical and Computational Analytics for Big Data. This workshop was organized jointly with the Institute for Big Data Analytics (IBDA) at Dalhousie University. Talks presented by Chipman, Gil, Grosse, Plante, Lix and Shipp were overviews of the research presented at selected workshops during the thematic program on Statistical Inference, Learning and Models for Big Data. The other talks presented research at IBDA on text mining, high-performance computing, visualization, bioinformatics, and privacy.
As with the other events in the Big Data thematic program, the diversity of fields involved in Big Data research was evident at this conference. At the heart of nearly all the presentations was a substantive data science application. For example, Stephanie Shipp presented an interesting application involving data on first responder calls by fire departments and the optimal allocation of services. Although the applications were quite varied, a number of common themes emerged. Issues discussed included privacy of individuals represented in databases; feasibility of computation, including high-performance computing; combination of different databases (e.g., in health policy research), including diverse data sources (geolocation, text, transactions, social media); and data visualization.
In a panel discussion on education in data science, participants Chipman, Gil, Reid, Matwin and Plante addressed a series of questions about the 'ideal' data science program:
- how traditional university degrees in statistics or computer science need to be modified to provide students with sufficient skills to work with big data
- the importance of soft skills such as data exploration, collaborative projects and communication in data science
- education of primary and secondary-level students in data science
- cross-disciplinarity and partnering with subject-matter experts in data science applications
- "data strategists" vs. "data scientists"
- making room in a university curriculum for new topics: what should we be teaching less of?
One goal of the closing workshop was to 'introduce' researchers from the IBDA to the statistical sciences community, and vice versa. The second goal was to give a high-level overview of the thematic program to interested researchers. We feel that the workshop met both goals. The panel discussion was very helpful for the academic statisticians who participated, as many are involved in introducing programs or courses in data science at their institutions. Slides from the presentations are available on CANSSI's web pages.
Our website now includes a section for employment ads. Our institutional members are welcome to send us their ads.