
Finding proof for "Beyond Standard Model Physics" by using Machine Learning techniques


I hope most of you have come across the news about "The crack in the physics Standard Model".

INTRODUCTION

In this work I have been analyzing another dataset, this one from LHCb (my earlier work used Higgs boson decay data), with the aim of finding physics beyond the Standard Model. This is a prime example of the "application of machine learning and data science technologies in doing fundamental science."

The discovery of the Higgs boson confirmed a key piece of the Standard Model. But there are still many physical phenomena that the Standard Model cannot explain, for example neutrino oscillations and dark matter. This has pushed researchers to look for models of the universe beyond the Standard Model (BSM).

Through this work I will try to use computational and data mining methods to look for evidence that supports new BSM physics. If no such evidence turns up, we would need some other way to explain what the Standard Model cannot.

We will do this by classifying a decay that should not occur according to the Standard Model, i.e. the decay of a tau particle into three muons. Observing it would be a clear sign of lepton flavour violation (LFV).

DATA

Let's talk about the data first. As usual with particle physics, much of the data is simulated, and in our case that is partly true. The signal events were generated with Monte Carlo simulation, but the background events are not simulated: they are taken from the neighbourhood of the signal events in the mass of the parent particle. Mass is therefore a clear discriminating feature, and we always have to be careful that our classifier doesn't rely much on mass. This is the challenge that the physics poses to the data science.
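To make the mass problem concrete, here is a minimal sketch on toy numbers (not the real dataset). The tau mass value is the known one; the window width, the Gaussian spread, and the column names `pt` and `ip` are assumptions made up purely for illustration. It shows that a "classifier" that looks at nothing but the mass already separates the two samples almost perfectly, which is exactly why the mass should be kept out of the training features.

```python
import random

TAU_MASS_MEV = 1776.86  # known tau mass
WINDOW_MEV = 20.0       # hypothetical half-width of the signal region (assumption)


def make_toy_events(n, seed=0):
    """Toy events only: signal peaks at the tau mass, background sits in the sidebands."""
    rng = random.Random(seed)
    events = []
    for _ in range(n):
        # Simulated-like signal: narrow peak around the tau mass.
        events.append({"mass": rng.gauss(TAU_MASS_MEV, 5.0), "signal": 1})
        # Background drawn outside the signal window, mimicking sideband data.
        side = rng.choice([-1, 1])
        offset = rng.uniform(WINDOW_MEV, 5 * WINDOW_MEV)
        events.append({"mass": TAU_MASS_MEV + side * offset, "signal": 0})
    return events


def mass_cut_accuracy(events):
    """Accuracy of a 'classifier' that uses only the mass window."""
    correct = 0
    for ev in events:
        predicted = 1 if abs(ev["mass"] - TAU_MASS_MEV) < WINDOW_MEV else 0
        correct += predicted == ev["signal"]
    return correct / len(events)


def training_features(all_columns, forbidden=("mass",)):
    """Drop columns that leak how the sample was produced."""
    return [c for c in all_columns if c not in forbidden]


if __name__ == "__main__":
    toy = make_toy_events(1000)
    print("accuracy from mass alone:", mass_cut_accuracy(toy))
    print("features to train on:", training_features(["mass", "pt", "ip"]))
```

Dropping the mass column is only the first step: other features can still be correlated with the mass, so the classifier output itself has to be checked as well.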

AIM OF THE WORK

This work becomes more interesting when we think as a data scientist. As a data scientist, I need to build a classifier that does not separate signal from background because of the difference in how the two samples were generated, but instead finds other, physical, discriminating evidence.

And as a physicist I need to understand how we can use the great advances in computation to solve the puzzles of fundamental science, including the way in which the data was generated to check this new hypothesis.

For example, if we happen to find evidence of LFV, the next step would be to measure the branching fractions of various channels to determine the nature of the BSM physics.

In the opposite case, if we cannot find evidence of LFV, we would tighten our constraints on the branching fraction for the LFV decay, which would help to constrain the parameter space of the BSM model used in the data generation process.

Now let's come to the data analysis portion. I will begin by talking about the challenges I faced:-

1) Understanding the problem and the physics behind it:- This is one of the hardest parts, as it requires an understanding of SM and BSM physics, a few basic concepts about detectors, and ideas such as the branching fraction. The good thing is that it is not necessary to master the physics before proceeding. Even a modest understanding of these concepts is enough to move forward.

2) Building an efficient classifier:- This is the hardest part of all this work. As we have already seen, our data has a discriminating factor that we need to avoid (the mass of the parent particle), so building a classifier becomes tricky. A basic classifier wouldn't work here, so I'm trying to customize some weighted ensemble machine learning methods for this purpose.

Next, we need to find out whether our classifier is classifying on the basis of new physics, and not mostly on the discriminating factors that arise from the different data generation processes of background and signal events. To do so, we will run our classifier on a background-only sample and see whether it classifies background events in the signal mass region as signal. If it does, then try another classifier :)

3) Evaluating our result:- This portion is comparatively easy to implement as a data scientist, but the results are quite hard to make sense of from the point of view of a physicist.
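The two ideas from challenge 2 (a weighted ensemble, and the check on a background-only sample) can be sketched roughly as below. This is only an illustration under assumptions, not my actual code: the function names, the weights, the score threshold, and the mass window are all hypothetical, and the ensemble here is just a weighted average of per-model signal probabilities.

```python
def weighted_ensemble_score(model_scores, weights):
    """Weighted average of per-model signal probabilities for one event."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, model_scores)) / total


def fake_signal_rates(background, threshold, mass_center, half_width):
    """Fraction of background events tagged as signal, inside and outside
    the signal mass window. `background` is a list of (mass, score) pairs
    from a background-only sample."""
    inside, outside = [], []
    for mass, score in background:
        bucket = inside if abs(mass - mass_center) < half_width else outside
        bucket.append(score > threshold)

    def rate(flags):
        # Empty buckets give 0.0 rather than dividing by zero.
        return sum(flags) / len(flags) if flags else 0.0

    return rate(inside), rate(outside)


if __name__ == "__main__":
    # Two hypothetical models, the first trusted three times as much.
    print(weighted_ensemble_score([0.8, 0.4], [3, 1]))

    # Hand-made (mass in MeV, ensemble score) pairs, background only.
    background = [(1777.0, 0.9), (1780.0, 0.2), (1700.0, 0.1), (1850.0, 0.3)]
    rate_in, rate_out = fake_signal_rates(background, 0.5, 1776.86, 20.0)
    print(rate_in, rate_out)
```

If the rate inside the window is clearly higher than the rate outside it, the classifier is probably picking up the mass rather than new physics, and it is time to try another one.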

CURRENT STATUS

I have been stuck for a month on some kind of missing value problem in my data, which causes a bug in my code that stops the classifier from working properly. The related physics also seems too complicated to master. And this is sad :( .
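For anyone hitting a similar wall, one common way to approach missing values is to first count them per column and then fill them with the column median. The sketch below does that with plain Python. It is not the fix for my bug (which I have not found yet), and the column names `pt` and `ip` are made up for the example.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(rows):
    """Number of missing entries per column; rows are dicts of column -> value."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + int(is_missing(value))
    return counts


def impute_median(rows):
    """Return new rows with missing entries replaced by the column median.
    Columns with no valid values at all are left untouched."""
    columns = sorted({name for row in rows for name in row})
    fill = {}
    for name in columns:
        present = [row[name] for row in rows
                   if name in row and not is_missing(row[name])]
        if present:
            fill[name] = median(present)
    cleaned = []
    for row in rows:
        new_row = {}
        for name in columns:
            value = row.get(name)
            if is_missing(value) and name in fill:
                value = fill[name]
            new_row[name] = value
        cleaned.append(new_row)
    return cleaned


if __name__ == "__main__":
    rows = [
        {"pt": 1.0, "ip": float("nan")},
        {"pt": None, "ip": 0.2},
        {"pt": 3.0, "ip": 0.4},
    ]
    print(count_missing(rows))
    print(impute_median(rows))
```

Counting first is the useful part: it shows which columns are affected before deciding whether filling, dropping rows, or dropping the column makes more sense.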

Hope to see you soon with a further update on this work.
