Details of Grant 

EPSRC Reference: EP/I016945/1
Title: Coarse geometry and cohomology of large data sets
Principal Investigator: Brodzki, Professor J
Other Investigators:
Bialek, Professor JW
Forster, Professor J
Carr, Professor L
Fliege, Professor J
MacArthur, Dr BD
Dent, Dr C
Shadbolt, Professor N
Researcher Co-Investigators:
Project Partners:
Department: School of Mathematics
Organisation: University of Southampton
Scheme: Standard Research
Starts: 01 October 2011
Ends: 30 September 2015
Value (£): 592,131
EPSRC Research Topic Classifications:
Algebra & Geometry
Logic & Combinatorics
Statistics & Appl. Probability
EPSRC Industrial Sector Classifications:
Financial Services
Related Grants:
Panel History:
Panel Date: 21 Sep 2010
Panel Name: Mathematics Underpinning Digital Economy and Energ
Outcome: Announced
Summary on Grant Application Form
The digital economy is founded on data. The current trend to develop intelligent, customer-led, interactive, real-time systems requires the ability to handle and interpret vast amounts of data efficiently, quickly, and with a degree of accuracy corresponding to the requirements. This point is underlined very well by two important recent developments. First, the Smarter Planet initiative, supported by IBM, envisages 'instrumented, interconnected data systems' in which the main elements of the physical environment are equipped with sensors that constantly exchange information. Second, the new transparency drive of the UK government will make huge data sets available to the public, creating 'an opportunity to build innovative applications which will bring significant economic benefit'.
The need for synthetic geometric methods in data analysis arises because of the large size and high dimensionality of the data sets involved. This proposal will extend recent important theoretical results to create a set of geometric and topological tools for data analysis, placing special emphasis on flexibility, efficiency, and close alignment with potential practical applications. This is an ideal and very exciting time to launch a project of this nature, and its results are very likely to have direct and important consequences for the initiatives mentioned above and for many other possible applications.
A central theme of the proposal is the study of geometric properties of large data sets at various scales, which corresponds to varying degrees of 'sharpness' with which a data set is viewed. For example, in searching large numbers of digital photographs for those that contain pictures of people, one requires a different resolution than when trying to identify a specific person. This proposal offers a very exciting opportunity for developing pure mathematical methods to the point where they can be directly applied to important, difficult and timely practical problems.
The proposed work is adventurous and interdisciplinary, and brings together pure and applied mathematicians and experts in operational research (OR), computer science, statistics, and energy systems. Potential for long-term practical applications will be tested in two specific application areas within the context of the wider Smarter Planet initiative. A main objective of the project is to develop geometric and cohomological tools of scale-dependent coarse geometry, with special emphasis on applications to finite metric spaces and, more specifically, to data sets. We will place strong emphasis on methods that can be developed into efficient tools for data analysis, and the research will be informed by specific problems arising from applications, which range from the theoretical to the more practical. We will test the theoretical ideas and results in two important cases. First, we will consider data sets arising from the UK Government's Open Data initiative. Second, within the context of smart grids, we will consider data generated by a large number of sensors monitoring various aspects of the performance of a power grid, with the objective of providing an accurate matching between supply and demand.
Key Findings
We have developed a way of combining topological data analysis with spectral properties of Laplace operators. This is now being tested on genomics data sets obtained in the study of early-onset breast cancer. We have provided a framework for feature selection through a combination of support vector machines (SVMs) with topological methods. We have made progress in our understanding of complex, heterogeneous data sets, and we are investigating specific data sets arising from the Galaxy Zoo projects.
Potential use in non-academic contexts
We are investigating possible applications in a specific context with a commercial partner and with Hampshire County Council.
No information has been submitted for this grant.
Sectors submitted by the Researcher
Creative Economy; Economy; Energy; Financial Services; Healthcare; Industrial Biotechnology; Information & Communication Technologies; Pharmaceuticals & Medical Biotechnology; Security & Conflict
Project URL:  
Further Information:  
Organisation Website: http://www.soton.ac.uk