A machine learning model predicted probability p using input x. It is unknown how model calculates the probability.
In the example below,
We have 100 xand p values.
Can someone please show an algorithm to find all values of x for which p is 0.5.
There are two challenges
- I don't know the function
p = f(x). I don't wish to fit some smooth polynomial curves which will remove the noise. The noises are important. xvalues are discrete. So, we need to interpolate to find the desired values ofx.
library(tidyverse)
x <- c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2.0, 2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3.0,3.1,3.2,3.3,3.4,3.5,3.6,3.7,3.8,3.9,4.0,4.1, 4.2,4.3,4.4,4.5,4.6,4.7,4.8,4.9,5.0,5.1,5.2,5.3,5.4,5.5,5.6,5.7,5.8,5.9,6.0,6.1,6.2, 6.3,6.4,6.5,6.6,6.7,6.8,6.9,7.0,7.1,7.2,7.3,7.4,7.5,7.6,7.7,7.8,7.9,8.0,8.1,8.2,8.3, 8.4,8.5,8.6,8.7,8.8,8.9,9.0,9.1,9.2,9.3,9.4,9.5,9.6,9.7,9.8,9.9, 10.0)
p <- c(0.69385203,0.67153592,0.64868391,0.72205029,0.64917218,0.66818861,0.55532616,0.58631660,0.65013198,0.53695673,0.57401464,0.57812980,0.39889101,0.41922821,0.44022287,0.48610191,0.34235438,0.30877592,0.20408235,0.17221558,0.23667792,0.29237938,0.10278049,0.20981142,0.08563396,0.12080935,0.03266140,0.12362265,0.11210208,0.08364931,0.04746024,0.14754152,0.09865584,0.16588175,0.16581508,0.14036209,0.20431540,0.19971309,0.23336415,0.12444293,0.14120138,0.21566896,0.18490258,0.34261082,0.38338941,0.41828079,0.34217964,0.38137610,0.41641546,0.58767796,0.45473784,0.60015956,0.63484702,0.55080768,0.60981219,0.71217369,0.60736818,0.78073246,0.68643671,0.79230105,0.76443958,0.74410139,0.63418201,0.64126278,0.63164615,0.68326471,0.68154362,0.75890922,0.72917978,0.55839943,0.55452549,0.69419777,0.64160572,0.63205751,0.60118916,0.40162340,0.38523375,0.39309260,0.47021037,0.33391614,0.22400555,0.20929558,0.20003229,0.15848124,0.11589228,0.13326047,0.11848593,0.17024106,0.11184393,0.12506915,0.07740497,0.02548386,0.07381765,0.02610759,0.13271803,0.07034573,0.02549706,0.02503864,0.11621910,0.08636754)
tbl <- tibble(x, p)
# plot for visualization
ggplot(data = tbl,
aes(x = x,
y = p)) +
geom_line() +
geom_point() +
geom_hline(yintercept = 0.5) +
theme_bw() +
theme(aspect.ratio = 0.4)
The figure below shows that there are five roots.

