Xan Gregg
Software development . Creator of & . , , . Views my own.
1,402 Tweets · 831 Following · 676 Followers
Tweets
Xan Gregg Dec 7
Glad you eventually got it. Any mods come to mind that would have helped? More annotations? Using individual points and smaller lines to emphasize discreteness rather than pattern?
Xan Gregg Dec 6
Found the NC election data as JSON at where COUNTY is in 1..100 corresponding to alphabetical order. Thanks to post from .
Xan Gregg Dec 5
Replying to @xangregg
Here is the county-level line chart for all 12 contested districts with a few outlier patterns highlighted. Interestingly, the part of Bladen County in NC-7 is an outlier in the opposite direction from the part in NC-9.
Xan Gregg Dec 5
NC-9 results as a line chart (or slope chart or parallel coordinates if you prefer). Each line is a county. By-mail ballots in Bladen are under suspicion.
Xan Gregg retweeted
JMP Software Nov 29
See how expert, blogger & author Kaiser Fung transformed a challenging from into a more understandable one:
Xan Gregg Nov 28
Replying to @dataNOTdoctrine
Awesome -- thanks! I love to see the look on people's faces when I tell them I'm using biological colors for male and female.
Xan Gregg Nov 24
Replying to @femi123
You can start with the free online course: Statistical Thinking for Industrial Problem Solving. Then check for docs and other resources in .
Xan Gregg Nov 24
Check Dan's tweets for other examples. I see your point on comparing groups with different sizes when you only want to feature the shape (not the quantity). But I haven't noticed more implied structure than is already suggested by the KDE shape.
Xan Gregg Nov 23
Replying to @robertstats
I'm guessing that (like everyone else!) they don't know what "machine learning" means, exactly. Might get more confidence for something more specific, like, "cross validation".
Xan Gregg Nov 22
Replying to @danz_68
Tukey has the same goal for smoothing and seems proud that his moving median step method will remove outliers of any size: 1, 2, 999999, 4, 5, ....
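To illustrate the point about outliers of any size, here is a minimal sketch that runs a window of 3 over the example series from the tweet, once with a median and once with a mean. The helper name `moving_window3` is made up for this illustration, and copying the end values unchanged is a simplifying assumption, not Tukey's end-value rule.

```python
from statistics import mean, median

def moving_window3(values, summarize):
    """Apply `summarize` to each centered window of 3 values.

    Assumption for this sketch: the two end values are copied unchanged.
    """
    if len(values) < 3:
        return list(values)
    middle = [summarize(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]

series = [1, 2, 999999, 4, 5]
by_median = moving_window3(series, median)  # the lone 999999 never reaches the output
by_mean = moving_window3(series, mean)      # the 999999 leaks into three windows

print(by_median)
print(by_mean)
```

The median only ever returns a value that is already in the window, so a single extreme value cannot survive unless it has an extreme neighbor.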
Xan Gregg Nov 21
Replying to @danz_68
Here's Savitzky-Golay(nl=4,nr=4,m=2). It is very close to the spline (for chosen lambda). I suppose they are both using local polynomial fits.
Xan Gregg Nov 21
Replying to @danz_68
Thanks for the reference. Looks like I was misusing "m". My m=7 chart was really m=2 with nl=nr=3 (nl + nr + 1 = 7). Will try a bigger window.
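The window arithmetic above can be sketched in code. This assumes the usual reading of the parameters: nl and nr are the number of points to the left and right of the center, and m is the order of the polynomial fitted by least squares to each window. The function names `solve` and `savgol_weights` are hypothetical, and this is not necessarily how the charts in this thread were computed.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def savgol_weights(nl, nr, m):
    """Smoothing weights for a window of nl + nr + 1 points.

    Fits a degree-m polynomial to offsets -nl..nr by least squares and
    returns the weights that give the fitted value at offset 0.
    """
    offsets = range(-nl, nr + 1)
    size = m + 1
    # Normal-equation matrix A^T A, where A[k][j] = offset_k ** j.
    ata = [[Fraction(sum(k ** (i + j) for k in offsets)) for j in range(size)]
           for i in range(size)]
    # First column of (A^T A)^-1; the weights are A times this vector.
    z = solve(ata, [Fraction(1)] + [Fraction(0)] * m)
    return [sum(z[j] * Fraction(k) ** j for j in range(size)) for k in offsets]

weights7 = savgol_weights(3, 3, 2)  # nl + nr + 1 = 7 points
weights9 = savgol_weights(4, 4, 2)  # the bigger window, 9 points
print([str(w) for w in weights7])
print([str(w) for w in weights9])
```

For m=2 and nl=nr=3 this gives the familiar 7-point weights (-2, 3, 6, 7, 6, 3, -2)/21. Smoothing an interior point is then a weighted sum of the 7 values around it; the edges need separate handling, such as the truncated calculations mentioned elsewhere in the thread.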
Xan Gregg Nov 21
Replying to @danz_68
Wasn't familiar with that one. Any recommended parameters? Here's my attempt at the m=5 and m=7 window examples from Wikipedia, with truncated calcs at the edges.
Xan Gregg Nov 21
Replying to @xangregg
His terminology denotes multi-pass smoothing steps. "3RSSH" means:
3 = moving median of 3, with end-value smoothing using extrapolation
R = repeat until stable
S = split plateaus and apply end-value smoothing on each side of the split (and then 3R)
H = hanning = moving weighted average of 3
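As a rough illustration of how the passes compose, here is a simplified sketch of only the "3", "R" and "H" steps. It deliberately omits the "S" splitting step and the end-value extrapolation rule, and copies the end values unchanged instead, so it is not a full 3RSSH. The hanning weights of 1/4, 1/2, 1/4 are the conventional ones and are an assumption here, as are all the function names.

```python
from statistics import median

def median3(values):
    """'3': moving median of 3 (ends copied unchanged in this sketch)."""
    if len(values) < 3:
        return list(values)
    middle = [median(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]

def repeat_until_stable(step, values):
    """'R': re-apply a smoothing step until the output stops changing."""
    current = list(values)
    while True:
        following = step(current)
        if following == current:
            return current
        current = following

def hanning(values):
    """'H': moving weighted average of 3 with weights 1/4, 1/2, 1/4."""
    if len(values) < 3:
        return list(values)
    middle = [0.25 * values[i - 1] + 0.5 * values[i] + 0.25 * values[i + 1]
              for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]

def smooth_3rh(values):
    """Compose the passes left to right: 3R, then H."""
    return hanning(repeat_until_stable(median3, values))

data = [4, 1, 3, 9, 2, 8, 7]  # made-up numbers, not Tukey's table
after_3r = repeat_until_stable(median3, data)
smooth = smooth_3rh(data)
print(after_3r)
print(smooth)
```

The median passes flatten the series into plateaus, and the hanning pass then softens the steps between them, which is the gap the "S" step is meant to address in the full method.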
Xan Gregg Nov 21
Modulo raw data errors, my reproduction of Tukey's smoother is showing promise. It may not be worth the effort, but by relying heavily on medians, it's robust to singleton extreme values and better follows repeated extremes, like the dip around 1933. "data = smooth + rough"
Xan Gregg Nov 21
Replying to @xangregg
I labeled two points "table error" because the table doesn't agree with the original (US Census) data, but the graph does.
Xan Gregg Nov 21
Having trouble reproducing Tukey EDA smoothing, and now I see his "raw data" graph doesn't even agree with his data table. Hard to say what he used for the smoothing. Here's his graph of x's and my circles using his table.
Xan Gregg Nov 20
Two more WP pollution data charts: a heat map and a variation with mean lines. Showing countries with 50M+ years lost to save space.
Xan Gregg Nov 20
Another view of WP pollution data: ranked and stacked bars for countries with at least 5M person-years lost.
Xan Gregg Nov 20
Trying a couple summarizing views of the pollution data in graphic article. Treemap and packed bars.