I have a data frame, df, with more than 10,000 rows. The first 30 rows are:
>df
      ms_estimate moso_estimate         sig
1554     6.518196     0.8782018          NS
825      6.170754     5.1146423 ms and moso
709      4.164373            NA        <NA>
13025    4.269822     5.7502859 ms and moso
2269     2.905754     0.7512660          NS
6714     3.401530     3.3315667          NS
14984    2.713234            NA        <NA>
7423     1.935319    -0.5283304          NS
8453     2.123371     0.1680088          NS
906            NA     0.0382903        <NA>
14196          NA     0.0382903        <NA>
10033    2.280660     3.1261748          ms
16397    2.280660     3.1261748          ms
4647     2.159354     1.5308502          NS
5121     1.847211     0.1912870          NS
4245     1.478000     0.5877055          NS
4732     1.973196     3.0805554        moso
4733     1.973196     3.0805554        moso
14411    1.776247     0.9723628          ms
9760     1.740305    -2.3284208 ms and moso
12158    1.720102     0.9989511          NS
7741     1.758581     0.2117089          ms
14883    1.788952            NA        <NA>
2315     1.832134     0.3518875          NS
4849     1.779664    -0.2311154          NS
7266     1.226592     0.5295427          NS
7189     1.716813     0.3342551          NS
253      1.667899     0.1715527          ms
13456    1.687443     0.4861952          ms
13518    1.542558     0.5361044 ms and moso
I want to make a scatter plot of 'moso_estimate' vs 'ms_estimate' and color the points according to whether they are significant in ms_estimate, in moso_estimate, in both, or in neither (encoded by the 'sig' variable). To avoid overplotting (by the 'sig' == "NS" points), I add the data in layers according to the 'sig' variable (using the subset argument and .() from the plyr package), with alpha = 0.2. The first layer is "NS" and the last should be "ms and moso". This works fine with the code below, except that I cannot control the legend when doing it this way. Is there a way to set the legend colors manually, preferably with alpha = 1? Here's the code:
library(ggplot2)
library(plyr)  # provides .() used in the subset arguments

g <- ggplot(data = df)
g +
     aes(x = ms_estimate, y = moso_estimate) +
     geom_point(color = "grey", shape = 20, alpha = 1, aes(fill = "NS")) +
     geom_point(subset = .(sig == "ms"), color = "green", shape = 20, alpha = 0.2, aes(fill = "ms")) +
     geom_point(subset = .(sig == "moso"), color = "blue", shape = 20, alpha = 0.2, aes(fill = "moso")) +
     geom_point(subset = .(sig == "ms and moso"), color = "red", shape = 20, alpha = 1, aes(fill = "ms and moso")) +
     xlim(-5, 5) + ylim(-5, 5)

 

 
You want to do something like this instead:
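
One way this could look, as a minimal sketch rather than a definitive solution: map 'sig' to colour (and alpha) inside aes() so that ggplot2 builds the legend itself, set the colours with scale_colour_manual(), and force the legend keys to alpha = 1 with override.aes. Sorting the rows by 'sig' stands in for the separate layers, since points are drawn in row order. The data frame below is a randomly generated stand-in for df (an assumption, only the column names match the question), rows with a missing 'sig' are dropped (also an assumption), and the colours and alpha values are copied from the code in the question.

```r
library(ggplot2)

# Hypothetical stand-in for the real df: same column names, random values.
set.seed(1)
n <- 2000
levs <- c("NS", "ms", "moso", "ms and moso")
df <- data.frame(
  ms_estimate   = rnorm(n),
  moso_estimate = rnorm(n),
  sig           = sample(levs, n, replace = TRUE, prob = c(0.85, 0.05, 0.05, 0.05))
)

# Drop rows with a missing sig (assumption), fix the level order, and sort
# so that "NS" rows come first and "ms and moso" rows come last.
plot_df <- df[!is.na(df$sig), ]
plot_df$sig <- factor(plot_df$sig, levels = levs)
plot_df <- plot_df[order(plot_df$sig), ]

p <- ggplot(plot_df,
            aes(x = ms_estimate, y = moso_estimate, colour = sig, alpha = sig)) +
  geom_point(shape = 20) +
  # Manual colours, one per level of sig.
  scale_colour_manual(values = c("NS" = "grey", "ms" = "green",
                                 "moso" = "blue", "ms and moso" = "red")) +
  # Per-group transparency on the points; no separate alpha legend.
  scale_alpha_manual(values = c("NS" = 1, "ms" = 0.2,
                                "moso" = 0.2, "ms and moso" = 1),
                     guide = "none") +
  # Draw the legend keys fully opaque regardless of the point alpha.
  guides(colour = guide_legend(override.aes = list(alpha = 1))) +
  xlim(-5, 5) + ylim(-5, 5)

print(p)
```

Because the colours come from a scale rather than from constants set outside aes(), the legend entries, their order, and their colours can all be adjusted in scale_colour_manual() without touching the layers.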