<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stuart V. Craig</title>
	<atom:link href="http://stuartvcraig.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://stuartvcraig.com</link>
	<description></description>
	<lastBuildDate>Fri, 10 May 2013 19:17:32 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>MY BASIC TWITTER BOT</title>
		<link>http://stuartvcraig.com/my-basic-twitter-bot/</link>
		<comments>http://stuartvcraig.com/my-basic-twitter-bot/#comments</comments>
		<pubDate>Wed, 13 Mar 2013 20:08:03 +0000</pubDate>
		<dc:creator>Stuart</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://stuartvcraig.com/?p=228</guid>
		<description><![CDATA[Here is a simple Twitter bot I created for fun. Every 5 minutes it searches for a tweet containing “I could care less” and replies with “I think you mean you couldn’t care less.” The bot itself is periodically down because &#8230; <a href="http://stuartvcraig.com/my-basic-twitter-bot/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="https://twitter.com/semantcpedantc">Here</a> is a simple Twitter bot I created for fun. Every 5 minutes it searches for a tweet containing “I could care less” and replies with “I think you mean you couldn’t care less.” The bot itself is periodically down because it’s technically spam (even if it’s funny spam). Here’s the <a href="http://stuartvcraig.com/wp-content/uploads/2013/03/semped_sanitized.py">code</a>, which requires the <a href="http://stuartvcraig.com/code-software/code.google.com/p/python-twitter/">python-twitter</a> library, and there&#8217;s a permanent section for it on the <a href="http://stuartvcraig.com/code-software/#semped">code &amp; software</a> page.</p>
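<p>As a rough illustration of the matching step only, here is a minimal sketch. It is not the linked script: the function name, the sample tweets, and the choice of a case-insensitive regular expression are all assumptions, and the Twitter search and posting calls are left out because they need credentials.</p>
<pre>
import re

# Phrase the bot looks for and the canned correction quoted in the post.
PATTERN = re.compile(r"\bI could care less\b", re.IGNORECASE)
CORRECTION = "I think you mean you couldn't care less."


def build_reply(screen_name, tweet_text):
    """Return the reply for a matching tweet, or None if it does not match."""
    if PATTERN.search(tweet_text) is None:
        return None
    return "@{0} {1}".format(screen_name, CORRECTION)


# Hypothetical tweets standing in for search results.
sample = [
    ("alice", "Honestly, I could care less about the playoffs."),
    ("bob", "I couldn't care less about the playoffs."),
]
replies = [build_reply(name, text) for name, text in sample]
print(replies)
</pre>
<p>Only the first sample tweet produces a reply; the second already uses the correct phrase, so the function returns None for it.</p>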
]]></content:encoded>
			<wfw:commentRss>http://stuartvcraig.com/my-basic-twitter-bot/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TODAY I LIKE (3/6/12)</title>
		<link>http://stuartvcraig.com/today-i-like-3-6-12/</link>
		<comments>http://stuartvcraig.com/today-i-like-3-6-12/#comments</comments>
		<pubDate>Tue, 06 Mar 2012 16:31:03 +0000</pubDate>
		<dc:creator>Stuart</dc:creator>
				<category><![CDATA[misc]]></category>

		<guid isPermaLink="false">http://stuartvcraig.com/?p=166</guid>
		<description><![CDATA[Become a Programmer, Motherfucker; The Recipes of Punchfork; This article, which makes some really good points about redistribution; Factual.com &#8211; a startup that is providing a harmonized API for a curated set of data from all over the web]]></description>
				<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://stuartvcraig.com/wp-content/uploads/2012/03/7337b9b7d00384ba212b351d689bd8cf_320x215.jpg"><img class="size-full wp-image-169 aligncenter" title="7337b9b7d00384ba212b351d689bd8cf_320x215" src="http://stuartvcraig.com/wp-content/uploads/2012/03/7337b9b7d00384ba212b351d689bd8cf_320x215.jpg" alt="Red velvet cake from Punchfork" width="320" height="215" /></a></p>
<ol>
<li><a href="http://programming-motherfucker.com/become.html">Become a Programmer, Motherfucker</a></li>
<li>The Recipes of <a href="http://punchfork.com/top/vegetarian">Punchfork</a></li>
<li>This <a href="http://www.cracked.com/blog/6-things-rich-people-need-to-stop-saying/">article</a>, which makes some really good points about redistribution</li>
<li><a href="http://www.factual.com/">Factual.com</a> &#8211; a startup that is <a href="http://techcrunch.com/2011/10/24/new-factual-resolve-api-will-help-clean-up-complete-location-databases/">providing a harmonized API for a curated set of data from all over the web</a></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://stuartvcraig.com/today-i-like-3-6-12/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>STATE DISTANCE MATRIX</title>
		<link>http://stuartvcraig.com/state-distance-matrix/</link>
		<comments>http://stuartvcraig.com/state-distance-matrix/#comments</comments>
		<pubDate>Thu, 23 Feb 2012 08:33:11 +0000</pubDate>
		<dc:creator>Stuart</dc:creator>
				<category><![CDATA[distance measures]]></category>
		<category><![CDATA[geography]]></category>
		<category><![CDATA[stata]]></category>

		<guid isPermaLink="false">http://stuartvcraig.com/?p=152</guid>
		<description><![CDATA[DESCRIPTION This is a tool that I created in order to shrink estimates for population statistics in the CPS using weights which decay by distance, but could certainly be used for other purposes. The datasets contain a distance measure by: &#8230; <a href="http://stuartvcraig.com/state-distance-matrix/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<h3>DESCRIPTION</h3>
<p>This is a tool that I created in order to shrink estimates for population statistics in the CPS using weights which decay by distance, but could certainly be used for other purposes. The datasets contain a distance measure by:</p>
<p>1. the minimum number of borders one must cross to enter each other state, and</p>
<p>2. the distance from the center of each state in miles.</p>
<h3><strong>LINKS FOR DOWNLOAD</strong>:</h3>
<p><a href="http://stuartvcraig.com/wp-content/uploads/2012/02/statedistances.zip">Distance files for each state</a></p>
<p><a href="http://stuartvcraig.com/wp-content/uploads/2012/02/statedistances_source.zip">Source files used to create distance data</a></p>
<p><a href="http://stuartvcraig.com/wp-content/uploads/2012/02/statedistance.do">A browser-viewable version of the code</a></p>
<p>Details below the jump. . .</p>
<p><span id="more-152"></span></p>
<h3>SOURCES AND METHOD</h3>
<p>First, I obtained a <a href="http://www.econ.umn.edu/~holmes/data/BorderData.html">dataset of borders</a> (meaning, which states border which) from Thomas J. Holmes at the University of Minnesota. I then derived a matrix giving the minimum number of borders that must be crossed between each pair of states (<strong>dist_border</strong>).</p>
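<p>One way to compute this kind of border count is a breadth-first search over the adjacency list. The sketch below is in Python rather than Stata and is not the code in the linked do-file; the function name is made up, and the adjacency data is a small hand-typed subset of states around Wyoming, not the Holmes dataset.</p>
<pre>
from collections import deque

# Illustrative subset of state adjacencies around Wyoming (not the full border dataset).
BORDERS = {
    "WY": ["MT", "ID", "UT", "CO", "NE", "SD"],
    "MT": ["WY", "ID", "ND", "SD"],
    "ID": ["WY", "MT", "UT"],
    "UT": ["WY", "ID", "CO"],
    "CO": ["WY", "UT", "NE", "KS"],
    "NE": ["WY", "CO", "SD", "KS"],
    "SD": ["WY", "MT", "ND", "NE"],
    "ND": ["MT", "SD"],
    "KS": ["CO", "NE"],
}


def border_distances(start, borders):
    """Breadth-first search: minimum number of borders crossed from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for neighbor in borders[state]:
            if neighbor not in dist:
                dist[neighbor] = dist[state] + 1
                queue.append(neighbor)
    return dist


dist_from_wy = border_distances("WY", BORDERS)
print(dist_from_wy["CO"], dist_from_wy["ND"], dist_from_wy["KS"])
</pre>
<p>On this subset the search gives 1 for CO and 2 for both ND and KS. Running it once per starting state would fill in the whole matrix.</p>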
<p>I then used coordinates from <a href="http://www.census.gov/geo/www/2010census/statearea_intpt.html">here</a> and the <a href="http://www.meridianworlddata.com/Distance-Calculation.asp">Great Circle Distance Formula</a> to derive the centroid-to-centroid distance in miles (<strong>dist_miles</strong>).</p>
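<p>For reference, one common form of the great-circle calculation is the spherical law of cosines. The Python sketch below is illustrative only: whether it is the exact variant used for dist_miles is an assumption, as are the Earth radius of 3963.0 miles, the function name, and the sample points.</p>
<pre>
import math

EARTH_RADIUS_MILES = 3963.0  # assumed radius; the post does not state the value used


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Spherical law of cosines; inputs are decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp so floating-point rounding cannot push the value outside the domain of acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


# Hypothetical points: a quarter of the way around the equator.
print(great_circle_miles(0.0, 0.0, 0.0, 90.0))
</pre>
<p>The sample call covers a quarter of the circumference, roughly 6,225 miles at the assumed radius. The arccosine is steep near 1, so for two nearly identical points a tiny rounding difference in the inputs can produce a small non-zero distance.</p>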
<p>Below is an example dataset (called statedist_wy.dta), which shows the distance from each state to WY. Note that WY has a non-zero value for dist_miles because the coordinate-to-miles formula I used is sensitive to rounding error and I started with an imprecise set of coordinates. Regardless, it&#8217;s a very close approximation of distance.</p>
<pre>fips  st_str  dist_border  dist_miles
83    WY      0            .0001181
94    AK      3            2270.973
82    ID      1            364.3542
81    MT      1            298.8094
73    OK      2            741.8647
56    NC      4            1597.308
34    MI      4            1094.41
33    IL      3            970.5767
13    VT      7            1737.133
42    IA      2            718.0474
57    SC      5            1568.759
93    CA      3            751.6623
95    HI      4            3196.948
91    WA      2            704.915
72    LA      4            1205.138
88    NV      2            535.6196
64    MS      4            1205.159
86    AZ      2            645.6375
15    RI      8            1831.076
22    NJ      6            1702.5
59    FL      5            1721.598
71    AR      3            984.1024
43    MO      2            852.4576
63    AL      4            1328.673
55    WV      4            1434.414
32    IN      4            1119.466
53    DC      5            1610.378
52    MD      5            1626.708
51    DE      6            1687.678
47    KS      2            571.2529
46    NE      1            407.9484
35    WI      3            895.3058
12    NH      8            1795.596
31    OH      4            1289.75
14    MA      7            1823.049
84    CO      1            296.2103
92    OR      2            659.2732
87    UT      1            331.3859
45    SD      1            378.5783
16    CT      7            1770.395
44    ND      2            462.6338
23    PA      5            1527.373
54    VA      4            1561.956
74    TX      3            918.6798
41    MN      2            694.559
62    TN      3            1230.674
58    GA      4            1492.552
85    NM      2            596.4534
61    KY      3            1229.216
11    ME      9            1916.734
21    NY      6            1607.38</pre>
<p>One word of caution: unlike Holmes, I <strong>include</strong> Alaska and Hawaii in my data (Alaska is said to &#8220;border&#8221; Washington and Hawaii is said to &#8220;border&#8221; California).</p>
]]></content:encoded>
			<wfw:commentRss>http://stuartvcraig.com/state-distance-matrix/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LEVENSHTEIN DISTANCE</title>
		<link>http://stuartvcraig.com/levenshtein-distance/</link>
		<comments>http://stuartvcraig.com/levenshtein-distance/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 05:05:28 +0000</pubDate>
		<dc:creator>Stuart</dc:creator>
				<category><![CDATA[distance measures]]></category>
		<category><![CDATA[record linkage]]></category>
		<category><![CDATA[recursion]]></category>
		<category><![CDATA[stata]]></category>

		<guid isPermaLink="false">http://stuartvcraig.com/?p=127</guid>
		<description><![CDATA[Need a Stata function that does spellcheck? OK, so not as good as spellcheck. The spellcheck function in MS Word does a lot of checking for transpositions and probability of misspelling that this won&#8217;t do, but it&#8217;s definitely less crude than counting &#8230; <a href="http://stuartvcraig.com/levenshtein-distance/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Need a Stata function that does spellcheck? OK, so not as good as spellcheck. The spellcheck function in MS Word does a lot of checking for transpositions and probability of misspelling that this won&#8217;t do, but it&#8217;s definitely less crude than counting the position-specific differences of two strings.</p>
<p>Levenshtein Distance is a metric for how dissimilar two strings are. Basically, it is the minimum number of single-character additions, deletions, or replacements necessary to transform one string into another.</p>
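<p>To make the definition concrete, here is a minimal Python sketch of the standard dynamic-programming approach. It is not a port of the Stata program linked further down, and the function name and example strings are only for illustration.</p>
<pre>
def levenshtein(a, b):
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    # previous holds the distances from the first i-1 characters of a to every prefix of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
print(levenshtein("Jon Smith", "John Smith"))
</pre>
<p>The first call prints 3 (two substitutions and one insertion), and the second prints 1, which is the kind of near-match that is useful when linking records by name.</p>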
<p>As with many math-related topics, <a href="http://en.wikipedia.org/wiki/Levenshtein_distance">Wikipedia</a> does a pretty good job of explaining the mechanics. Also, here are some <a href="http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance">implementations</a> in other languages.</p>
<p>Programming a Mata function for this would be fairly easy, as would creating a <a href="http://www.stata.com/statalist/archive/2002-08/msg00436.html">matrix</a> for each pair of words. This program uses temporary variables instead, which can be a bit computationally intensive, but is good for making lots of comparisons simultaneously (after all, Stata is good for vector manipulation).</p>
<p>I wrote this program to do record linkage using names, which can be accomplished by using joinby and then comparing strings of matches.</p>
<p>Here&#8217;s the <a href="http://stuartvcraig.com/wp-content/uploads/2012/02/edistance.do">link</a> &#8211; enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://stuartvcraig.com/levenshtein-distance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WELCOME!</title>
		<link>http://stuartvcraig.com/welcome/</link>
		<comments>http://stuartvcraig.com/welcome/#comments</comments>
		<pubDate>Thu, 25 Aug 2011 21:59:49 +0000</pubDate>
		<dc:creator>Stuart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://stuartvcraig.com/?p=10</guid>
		<description><![CDATA[. . . to stuartvcraig.com!]]></description>
				<content:encoded><![CDATA[<p>. . . to stuartvcraig.com!</p>
]]></content:encoded>
			<wfw:commentRss>http://stuartvcraig.com/welcome/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
