Statistical Science


Dr Franz Kiraly

Note: Franz Kiraly's primary affiliation changed to Shell UK (Shell.ai) in January 2020. He retains an honorary faculty position at UCL. Due to IT problems, this home page is frozen in its 2019 state; please note that it therefore contains outdated information. UCL ISD is working on resolving this problem. Dr Kiraly can still be contacted via his UCL email address.

Position: Honorary Lecturer

Email (@ucl.ac.uk): f.kiraly

Franz Király (September 2013, small version)…
Themes
Computational Statistics
Stochastic Modelling of Complex Systems
General Theory and Methodology
Curriculum Vitae (2018/04)

Core interests

As a practical statistician and machine learner, I am interested in creating a data analytics workflow which is empirically solid, quantitative, and useful in the real world.

My research aims to provide the foundations, through:

(i) studying external assessment, comparison, and validation of white-box and black-box methodology: how to empirically test whether the (black/white-box) method does what is claimed? Is it better than simpler alternatives, or better than a random guess?

(ii) theoretical analysis and practical workflow building for complex modelling tasks, e.g., in the presence of structured/hierarchical observation mechanisms or non-standard/composite data types. For example, prediction in the context of time series, spatial observations, comparisons, multiple data sources.

(iii) design and implementation of automated modelling and model validation workflows: how to best do the above in a suitable software environment? How to use external checks to find the most suitable model, especially under trade-offs such as those between accuracy, computational cost, and interpretability?

These are especially relevant in applications where the data and the associated scientific questions, rather than a single method class, are the focus of interest. Current project and collaboration domains include energy, finance, clinical health, sports and prevention.

Recent projects

Selected recent work on data scientific methodology:

Workflow design and theory for probabilistic supervised learning. Where supervised learning predicts a label, the probabilistic variant additionally aims to predict the uncertainty of that prediction. Our work is the first to formalize this task in an entirely model-agnostic way, and provides a number of theoretical insights as well as a formal workflow design implemented in a Python package, , which is sklearn for the probabilistic case.
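To make the task concrete, here is a minimal sketch of probabilistic supervised learning using only the Python standard library. It is not the interface of the package mentioned above: the Gaussian linear model, the function names and the synthetic data are all assumptions chosen for illustration. Each prediction is a full distribution (a mean and a standard deviation), and the model is judged by its out-of-sample log-loss against an uninformed baseline that ignores the features.

```python
import math
import random


def fit_gaussian_linear(xs, ys):
    """Least-squares line plus a constant residual std: predicts N(a + b*x, s^2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    s = math.sqrt(sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / n)
    return lambda x: (a + b * x, s)


def fit_gaussian_baseline(xs, ys):
    """Uninformed baseline: ignores x and predicts the training marginal N(m, s^2)."""
    n = len(ys)
    m = sum(ys) / n
    s = math.sqrt(sum((y - m) ** 2 for y in ys) / n)
    return lambda x: (m, s)


def log_loss(predict, xs, ys):
    """Mean negative log-density of the observed labels under the predicted Gaussians."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu, sigma = predict(x)
        total += 0.5 * math.log(2 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2 * sigma ** 2)
    return total / len(xs)


# Synthetic data (an assumption for illustration): y = 2x + unit Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(400)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]
train_x, test_x = xs[:300], xs[300:]
train_y, test_y = ys[:300], ys[300:]

model = fit_gaussian_linear(train_x, train_y)
baseline = fit_gaussian_baseline(train_x, train_y)
model_loss = log_loss(model, test_x, test_y)
baseline_loss = log_loss(baseline, test_x, test_y)
print(f"model log-loss: {model_loss:.3f}, baseline log-loss: {baseline_loss:.3f}")
```

A lower log-loss than the baseline indicates that the predicted distributions carry information about the label beyond its marginal distribution.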

Predictive independence testing and graphical modelling. Our work establishes a close theoretical connection between the task of testing whether variables are (conditionally) independent, and testing whether it is possible to predict one from the other (better than a certain baseline). This leads to a close link between the predictive modelling and the independence testing workflows, enabling easy multivariate independence testing.
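The link between prediction and independence can be sketched as follows. This is an illustrative simplification, not the procedure from the preprint: the linear predictor, the squared loss, the normal approximation and all names are assumptions. The idea is to compare the out-of-sample losses of a predictor that uses X against a baseline that ignores X; if the predictor is significantly better, that is evidence against independence of X and Y.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def predictive_dependence_p_value(xs, ys, n_train):
    """One-sided p-value for 'predicting y from x beats the mean baseline'.

    Fits both predictors on the first n_train points, then applies a paired
    z-test (normal approximation) to the per-point squared-loss differences
    on the remaining points. A small p-value suggests dependence.
    """
    a, b = fit_line(xs[:n_train], ys[:n_train])
    base = sum(ys[:n_train]) / n_train
    diffs = [
        (y - base) ** 2 - (y - (a + b * x)) ** 2
        for x, y in zip(xs[n_train:], ys[n_train:])
    ]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    z = mean / (sd / math.sqrt(n))
    return 0.5 * math.erfc(z / math.sqrt(2))


# Synthetic data (an assumption for illustration).
rng = random.Random(1)
n = 400
xs = [rng.gauss(0, 1) for _ in range(n)]
dependent_ys = [x + rng.gauss(0, 0.5) for x in xs]
independent_ys = [rng.gauss(0, 1) for _ in range(n)]

p_dependent = predictive_dependence_p_value(xs, dependent_ys, n_train=200)
p_independent = predictive_dependence_p_value(xs, independent_ys, n_train=200)
print(f"p (dependent pair): {p_dependent:.3g}")
print(f"p (independent pair): {p_independent:.3g}")
```

With a linear predictor this sketch can only detect dependence that shows up as a linear trend in the mean; a more flexible predictive model would be needed to detect other forms of dependence.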

Learning with complex data types. In cases where variables are not numbers, categories, or strings, but more complex objects such as series, images, or graphs, a more abstract data storage, processing and modelling infrastructure is needed. Model composition is the natural paradigm for the latter, leading to challenges in object-oriented software design. The package for Python extends the joint functionality of pandas and sklearn transformers to this setting (work in progress).
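A minimal sketch of the composition idea, in plain Python rather than pandas or sklearn: all class names here are hypothetical and only mimic the familiar fit/transform/predict convention. Each observation is a series of arbitrary length; a transformer reduces it to a fixed-length numeric vector, and a classifier for ordinary vectors is composed on top.

```python
from statistics import mean, pstdev


class SeriesSummarizer:
    """Transformer: maps each series-valued observation to [mean, std]."""

    def fit(self, rows, labels=None):
        return self  # stateless, nothing to learn

    def transform(self, rows):
        return [[mean(series), pstdev(series)] for series in rows]


class NearestCentroid:
    """Classifier on numeric vectors: predicts the label of the closest centroid."""

    def fit(self, vectors, labels):
        groups = {}
        for vector, label in zip(vectors, labels):
            groups.setdefault(label, []).append(vector)
        self.centroids_ = {
            label: [mean(column) for column in zip(*members)]
            for label, members in groups.items()
        }
        return self

    def predict(self, vectors):
        def squared_distance(u, v):
            return sum((a - b) ** 2 for a, b in zip(u, v))

        return [
            min(self.centroids_, key=lambda lab: squared_distance(v, self.centroids_[lab]))
            for v in vectors
        ]


class Pipeline:
    """Composition: a transformer followed by an estimator, exposed as one model."""

    def __init__(self, transformer, estimator):
        self.transformer = transformer
        self.estimator = estimator

    def fit(self, rows, labels):
        features = self.transformer.fit(rows, labels).transform(rows)
        self.estimator.fit(features, labels)
        return self

    def predict(self, rows):
        return self.estimator.predict(self.transformer.transform(rows))


# Toy data (an assumption for illustration): series of differing lengths.
train = [[1.0, 1.1, 0.9, 1.0], [5.0, 5.2, 4.9], [0.8, 1.2], [5.1, 4.8, 5.0, 5.3, 4.9]]
labels = ["low", "high", "low", "high"]

model = Pipeline(SeriesSummarizer(), NearestCentroid()).fit(train, labels)
predictions = model.predict([[1.0, 0.9, 1.1], [5.0, 5.1]])
print(predictions)
```

The composite behaves like a single model for series-valued data, so it can be validated and compared in the same way as any other predictor.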

Selected recent application work:

Prediction and Prevention of Falls in a Neurological In-Patient Population. Falling, and associated injuries such as hip fracture, are a major strain on health and health resources, especially in the elderly or hospitalized. We are able to predict, with high accuracy in a neurological population, whether a patient is likely to fall during their stay, using only a number-connecting test (the Trail Making Test).

Quantification and Prediction in Running Sports. Characterizing the training state of running athletes, and making predictions for race planning and training. We can predict marathon times with an error on the order of a few minutes, and we are able to accurately summarize an athlete by three characteristic numbers.

Professional roles

I currently hold the following professional roles, and act as a point of contact for the matters described under each:

UCL Statistics: enterprise coordinator & MAPS enterprise board member
Internal enabling role, and point of contact for translational engagement with UCL Statistics and the MAPS faculty, especially on data science topics, e.g., data scientific consulting, courses on data analytics, statistics, machine learning/AI, and commissioned research projects. The MAPS faculty PoC is Jawwad Darr (UCL MAPS faculty, Vice-Dean Enterprise).

UCL Statistics: diversity & equality board member
Internal point of contact for diversity & equality related matters.

UCL CoMPLEX: board member
Representing UCL Statistics on the board of UCL CoMPLEX. Potential point of contact for academics and industry, especially on topics in the intersection of biotech/health and statistics/machine learning/AI.

Alan Turing Institute: Data Study Groups, scientific lead and coordination team member
Co-organisation of the outreach scheme, management of data scientific and technical aspects. Possible point of contact for translational engagement with the Alan Turing Institute. The main points of contact are Sebastian Vollmer (Data Study Groups, Director) and Nicolas Guernion (Alan Turing Institute, Director of Partnerships).

PhD applications

I am currently accepting applications for PhD supervision, subject to the formal requirements for obtaining a graduate research degree, and to supervisory limits. Applications should include a CV, a short description of your research interests, a description of your background in mathematics/statistics, data analysis, and programming, as well as a motivation statement on what you are looking for in a PhD.

PhD stipends (through grants and projects) may be available. For these, kindly apply through the official channels, e.g., through the respective funding bodies (which may depend on your citizenship), or the UCL Human Resources portal.

Internship applications

I am accepting applications for short-term or summer internships from highly talented applicants. These last 1-3 months and cover subsistence plus world-wide travel expenses at the bursary rate. Unsolicited applications are possible, and should include an up-to-date CV, a motivation letter with a description of research interests, evidence of scientific writing skills (e.g., a thesis or paper), evidence of data analytics skills (e.g., an analytics report), and/or evidence of programming skills (e.g., a GitHub account with public repositories).

Short Curriculum Vitae

At the , I obtained my Diplomae (equivalent to MSc or MD, as regards content) in Computer Science, Mathematics, Medicine and Physics in the years 2003, 2005, 2006 and 2011; in 2008, I received my PhD in Medicine.

From 2007 to 2010, I completed my PhD thesis in Mathematics on the topic of Arithmetic Geometry, under supervision of and in cooperation with in Ulm.

From 2010 to 2013, I worked as a postdoctoral researcher in 's , at the Technische Universität Berlin, and I was an associate member of 's , at the Freie Universität Berlin.

In 2012 I was appointed at the where I spent a total of six months, split over 2012, 2013 and 2014.

Since 2013, I have been working as a lecturer (comparable to a tenured assistant professor) at .

In 2015, I visited the as an AScI Visiting Fellow.

Since 2016, I have also been a faculty fellow at the newly founded whose vision is to bundle and catalyze the UK's efforts in modern data science, and I have recently been co-organizing its as a member of the DSG coordination team.

[Download Curriculum Vitae] (2018/04)

Publications

(the arXiv versions are usually the most up-to-date)

Preprints

Gressmann F, Király FJ, Mateen BA, Oberhauser H. Probabilistic Supervised Learning. Preprint, 105 pages, arXiv 1801.00753. 2018.

Burkart S, Király FJ. Predictive Independence Testing, Predictive Conditional Independence Testing, and Predictive Graphical Modelling. Preprint, 50 pages, arXiv 1711.05869. 2017.

Király FJ, Qian Z. Modelling Competitive Sports: Bradley-Terry-Élő Models for Supervised and On-Line Learning of Paired Competition Outcomes. Preprint, 53 pages, arXiv 1701.08055. 2017.

Mateen BA, Bussas M, Doogan C, Waller D, Saverino A, Király FJ, Playford ED. Machine Learning in Falls Prediction; A cognition-based predictor of falls for the acute neurological in-patient population. Preprint, 37 pages, arXiv 1607.07751. 2016.

Király FJ, Oberhauser H. Kernels for sequentially ordered data. Preprint, 48 pages, arXiv 1601.08169. 2016.

Király FJ, Ziehe A, Müller K-R. Learning with algebraic invariances, and the invariant kernel trick. Preprint, 17 pages, arXiv 1411.7817. 2014.

Blythe DAJ, Király FJ, Theran L. Algebraic combinatorial methods for low-rank matrix completion with application to athletic performance prediction. Preprint, 13 pages, arXiv 1406.2864. 2014.

Király FJ, Kreuzer M, Theran L. Learning with cross-kernels and Ideal PCA. Preprint, 14 pages, arXiv 1406.2646. 2014.

Király FJ, Theran L. Matroid Regression. Preprint, 16 pages, arXiv 1403.0873. 2014.

Király FJ, Ehler M. The algebraic approach to phase retrieval and explicit inversion at the identifiability threshold. Preprint, 26 pages, arXiv 1402.4053. 2014.

Király FJ, Kreuzer M, Theran L. Dual-to-kernel learning with ideals. Preprint, 15 pages, arXiv 1402.0099. 2014.

Király FJ, Rosen Z, Theran L. Algebraic matroids with graph symmetry. Preprint, 70 pages, arXiv 1312.3777. 2013.

Király FJ. Efficient orthogonal tensor decomposition, with an application to latent variable model learning. Preprint, 14 pages, arXiv 1309.3233. 2013.

Király FJ, Theran L. Coherence and sufficient sampling densities for reconstruction in compressed sensing. Preprint, 18 pages, arXiv 1302.2767. 2013.

Refereed conference publications

Király FJ, Ehler M. Algebraic reconstruction bounds and explicit inversion for phase retrieval at the identifiability threshold. Journal of Machine Learning Research Workshop & Conference Proceedings Vol.24 – Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics. 9 pages. 2014.

Király FJ, Theran L. Obtaining error-minimizing estimates and universal entry-wise error bounds for low-rank matrix completion. Neural Information Processing Systems 2013, to appear in Proceedings. Preprint version available as arXiv 1302.5337, 14 pages. 2013.

Király FJ, Ziehe A. Approximate rank-detecting factorization of low-rank tensors. IEEE International Conference of Acoustics, Speech, and Signal Processing 2013, to appear in Proceedings. Preprint version available as arXiv 1211.7369, 5 pages. 2013.

Király FJ, Tomioka R. A combinatorial algebraic approach for the identifiability of low-rank matrix completion. International Conference on Machine Learning 2012. Published in ICML Proceedings, made available by ICML as arXiv 1206.4670, 8 pages. 2012.

Király FJ, Von Buenau P, Müller JS, Blythe DAJ, Meinecke FC, Müller K-R. Regression for sets of polynomial equations. Journal of Machine Learning Research Workshop & Conference Proceedings Vol.22 – Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 22:628-637. 2012.
[code] (ZIP, 17,4 KB)

Király FJ, Ziehe A, Müller K-R. An algebraic method for approximate rank one factorization of rank deficient matrices. Latent Variable Analysis and Signal Separation 2012 Conference Proceedings, 272-279. 2012.

Refereed journal publications

Ioannidis K, Chamberlain SR, Treder M, Király FJ, Leppink EW, Redden SA, Stein DJ, Lochner C, Grant JE. Problematic internet use (PIU): Associations with the impulsive-compulsive spectrum. An application of machine learning in psychiatry. Accepted in Journal of Psychiatric Research. 2016.

Blythe DAJ, Király FJ. Prediction and quantification of individual athletic performance. PLoS ONE 11(6): e0157257. 2016.

Ehler M, Graef M, Király FJ. Phase retrieval using random cubatures and fusion frames of positive semidefinite matrices. Waves, Wavelets and Fractals – Advanced Analysis. Dec 2015.

Király FJ, Theran L, Tomioka R. The algebraic combinatorial approach for low-rank matrix completion. Journal of Machine Learning Research, 16(Aug):1391-1436. 2015.

Larsen P, Király FJ. Fano schemes of generic intersections and machine learning. International Journal of Algebra and Computation, Vol.24, No.17, 923-933. 2014.

Király FJ, Lütkebohmert W. Invariants of regular local rings by p-cyclic group actions. Algebra and Number Theory, Vol.7, No.1, 63-74. 2013.

Király FJ, Von Buenau P, Blythe DAJ, Meinecke FC, Müller K-R. Algebraic geometric comparison of probability distributions. Journal of Machine Learning Research 13(Mar):855-903. 2012.
[code] (ZIP, 3,8 KB)

Preprint published in the Oberwolfach Preprint Series as

Müller JS, von Bünau P, Meinecke FC, Király FJ, Müller K-R. The Stationary Subspace Analysis Toolbox. Journal of Machine Learning Research 12(Oct):3065-3069. 2011.

Kilian H-G, Kazda M, Király FJ, Kaufmann D, Kemkemer R, Bartkowiak D. On the structure-bounded growth processes in plant population. Cell Biochemistry and Biophysics 57:87-100. 2010.

Schlenk RF, Döhner K, Mack S, Stoppel M, Király F, Götze K, Hartmann F, Horst HA, Koller E, Petzer A, Grimminger W, Kobbe G, Glasmacher A, Salwender H, Kirchen H, Haase D, Kremers S, Matzdorff A, Benner A, Döhner H. Prospective evaluation of allogeneic hematopoietic stem-cell transplantation from matched related and matched unrelated donors in younger adults with high-risk Acute Myeloid Leukemia: German-Austrian trial AMLHD98A. Journal of Clinical Oncology 20;28(30):4642-4648. 2010.

Von Bünau P, Meinecke FC, Király FJ, Müller K-R. Finding stationary subspaces in multivariate time series. Physical Review Letters. 103, 214101. 2009.

Király FJ, Kletting P, Reske SN, Glatting G. Modelling radioimmunotherapy (RIT) with anti-CD45 antibody to obtain a more favourable biodistribution. Nuklearmedizin 48:113-119. 2009.

Theses

Király FJ. Wild quotient singularities of surfaces and their regular models. Doctoral dissertation, Ulm. 2010.

Király FJ. Vergleich verschiedener Postremissionsstrategien bei der akuten myeloischen Leukämie mit normalem Karyotyp. Doctoral dissertation, Ulm. 2008.

Data scientific software

Open source software in the sklearn ecosystem - contributions and collaborations are very welcome:

- extending pandas to data containers for structured, hierarchical and complex data types, and transformer interfaces compatible with the sklearn API

- machine learning toolbox for paradigm-agnostic probabilistic supervised learning, i.e., probabilistic label predictions, extends the sklearn API and provides interfaces for Bayesian toolboxes (see also section 8 of the )

- predictive conditional independence testing with a workflow interface to predictive models in sklearn (see also section 6 of the )

Past Talks: Slides and Videos

2012, June 29, 14:00-14:20, ICML 2012
University of Edinburgh, Appleton Tower, Room AT LT 2

A Combinatorial Algebraic Approach for the Identifiability of Matrix Completion

2012, April 23, 19:35-20:00, AISTATS 2012
La Palma, Los Cancajos, H10 Taburiente Playa, Las Nieves/Tenguía room

Regression for sets of polynomial equations