Stata Tidbits

These tidbits contain bits and pieces of information I hope you find helpful to use Stata more effectively. You can receive notifications of new tidbits as they are added (via email) by clicking on the subscribe box at the left. (Every email has an unsubscribe link, making it a snap to unsubscribe.)
Tuesday, April 12, 2011

SSC help files over the internet

In last week's tidbit we saw how you can view Stata help files over the internet. Did you know that you can also view the help files for programs stored in the Statistical Software Components (SSC) archive? (Thanks to Kit Baum for sharing this with me.) Here are some examples.

As you can see from the examples, each begins with http://repec.org/help.php?c followed by the name of the program. The name of the program generally (but not always) corresponds to the name of the package.

You can use this as a means of displaying the help file for SSC programs within your web site or perhaps for sending the link of a help file to a friend. For more information about the SSC archive, see help ssc.
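For readers who would rather stay inside Stata, here is a minimal sketch of the ssc subcommands that cover similar ground. It assumes you have an internet connection and uses somersd purely as an illustrative package name; substitute any package you are interested in.

```stata
* Show the package description and the files it contains, without installing
ssc describe somersd

* Install the package (add the replace option to update an existing copy)
ssc install somersd

* Once installed, the help file can be viewed locally
help somersd
```

Running ssc describe first is a cheap way to check that the package name is right before you install anything.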

You can download the example data files from this tidbit (as well as all of the other tidbits) as shown below. These commands will download all of the example data files into the current folder on your computer. (If you have done this before, then you may need to specify net get stowdata, replace to overwrite the existing files.)

net from http://www.MichaelNormanMitchell.com/storage/stowdata
net get stowdata
If you have thoughts on this Stata Tidbit of the Week, you can post a comment. You can also send me an email at MichaelNormanMitchell and then the at sign and gmail dot com. If you are receiving this tidbit via email, you can find the web version at http://www.michaelnormanmitchell.com/ .

Reader Comments (10)

. ssc type somersd.sthlp

gives another kind of display. Sure, it scrolls by, but that might be enough.

(If the help is .hlp, revise accordingly.)

April 13, 2011 | Unregistered Commenter Nick Cox

If you want to share your .ado files easily, I wrote a program that will generate the .pkg and stata.toc file for you. It works great for sharing files on a "personal" site (e.g., what you might get from your university) or a public site like github, bitbucket, or code.google.com.

The program is called pub2web. You can download it here: http://code.google.com/p/kk-adofiles/source/browse/ (in the p folder). Here are instructions for how to use it with code.google.com.

April 13, 2011 | Unregistered Commenter keith

I failed to get access to Keith's site using -net- and bailed out because the alternatives involve software I don't use. That's just me, but I suspect it would apply to some others too. I certainly don't want to install new software and learn how to use it just to find out what he has available.

April 14, 2011 | Unregistered Commenter Nick Cox

I know. It's a pain that . net from doesn't work. You can download files manually through the web interface. Source > Browse > P > pub2web.ado > click "View raw file" on the right side of the page. Here's a list of programs: http://code.google.com/p/kk-adofiles/source/browse/stata.toc

Mercurial is a great program. I use it for all my projects and view it as indispensable for my everyday workflow. Yes, there's a learning curve, but hopefully it's not required.

April 14, 2011 | Unregistered Commenter keith

Keith: I alerted StataCorp to your bug claim. They're working on the problem, which seems more at Google's end.

As I understand it, you are recommending that Stata programmers learn to use a new program so that they can make their stuff accessible. Hundreds of them have been doing it successfully otherwise for up to 20 years. So, they would need to have a much more detailed idea of what advantages your method has over any others. Also, so long as it is true that -net- can't see your programs, they seem far more difficult to find out about than any on (e.g.) SSC -- which is where Michael started this off.

April 14, 2011 | Unregistered Commenter Nick Cox

http://www.michaelnormanmitchell.com/stow/ssc-help-files-over-the-internet.html

I think I was confusing. Let me separate the issues:

1. My pub2web.ado program is a Stata program. You give it a list of .ado files and it automatically generates the stata.toc and *.pkg files for you. It copies all the .toc, .pkg, .ado and .hlp files into the specified folder so they are ready to be uploaded to ANY server. You may like my program, no matter how you publish your files.

2. Mercurial is one of several revision control packages. Revision control is used by teams who develop software. Think of serious programming companies (e.g., Microsoft) or any large open-source project (R, Linux, Firefox, ...).

I have personally found it to be great even for projects with one programmer (me) and small teams (me and a coauthor). Revision control offers some great features that go far beyond just making files accessible. It's a system for my everyday workflow. Sharing is just one benefit. (Although it's great to be able to share with the click of a single button.)

For example, I like to "rewind" my code to view what it looked like 3 weeks ago, and see what changed since then. I even track MS Word/Excel files in revision control. There are tons of introductions to the topic on the web. I'll just point you to this nice introduction: http://betterexplained.com/articles/a-visual-guide-to-version-control/. Here is another.

People who have been programming for 20 years probably have a good system. I personally think revision control is superior to the systems I used to use, and I think it is gaining in popularity for good reason. (Although I'm an economist and don't know very many people who use it in my field. It seems to be more popular among software developers from what I can tell.)

I use Mercurial and Git, two "competitors". TortoiseHG is a "point and click" version of Mercurial for Windows.

3. If you use revision control software, it also has nice features to send files (or modifications to files) by email and/or upload them to a central server for sharing. Companies like Google, Bitbucket, and Github offer free hosting of public projects for Mercurial (or Git). The latter two offer paid hosting of private projects. You can really use any computer (server), but these companies offer some add-on features (like "issue" ticketing systems and the website). I (automatically) "push" my code to various servers to publish code, share it with coauthors, and/or back up my code.

4. You don't have to use revision control to upload code to code.google.com. You could upload them manually through your internet browser (that is what Eric Booth does).

5. Users can access the files (1) with Mercurial or (2) download them from the website. Someone from Stata emailed me and maybe someday they'll fix Stata so you can (3) download them with . net from .... too.

Sorry, this got long... I guess you asked for "detailed" and I gave it to you! :)

April 14, 2011 | Unregistered Commenter keith

Keith: Thanks indeed for the detail.

This is mostly a matter of there being different ways of doing things, but getting an idea of the advantages and disadvantages of each is still desirable.

I know a couple that uses software to compile their weekly shopping list. In our household we write down things we've run out of on a piece of recycled card and remember other stuff in the store. Sometimes, we forget something we didn't write down. Sometimes, our friends do that too.

Files like .toc files are simple text files. I would just use a text editor directly. You should write up your program properly in a talk or article and then people would learn about its advantages and judge whether to use it.

Stata user-programmers mostly write code that is 10 -- 100 -- 1000 lines long and mostly write by themselves or with one other person. (I find that three is often getting complicated; having a buddy with compatible style does work well.) I don't think most use revision control software. I know a Stata user who's written his own project management software in Stata and he certainly understands what he has created but it's a much bigger deal for another programmer to change project management style completely and adopt someone else's way of thinking about it. Many Stata projects involve an .ado, a help file and a test script. You add a feature, you fix a bug, you make sure that you haven't broken anything else as a side-effect; it's not complicated.

But again you should give a talk or write an article on how revision control software is a good idea for Stata programmers, so that people can judge for themselves.

That's the programmers' side. What about the users, most of whom in the Stata community are not programmers (and that's fine)? The twenty years reference was an allusion to the fact that the Stata Technical Bulletin in 1991 was the first formalised way of users sharing their programs.

I maintain that the -net- command in Stata remains the most important breakthrough.

-ssc- is just a wrapper for using -net- in accessing SSC. I was one of the original authors of -ssc-, but clearly it is now an official command. No one has ever said that everyone should use SSC.

But we've seen lots of people get messed up if they install using browsers rather than Stata. They very often install some files but not others, or put them in the wrong places. -net- was designed partly to avoid that. I'd still argue that -- unless something like a major firewall prohibits it -- users are always better off using -net- to download user-written programs.

Specifically, StataCorp were not aware that there was a problem with -net- and the Googlecode site until I reported that yesterday. It is not a case of fixing Stata, but of Stata working around an awkwardness that that site presents because it doesn't support all the standard ways of downloading.

April 15, 2011 | Unregistered Commenter Nick Cox

Nick,

Thanks for the reply. I hope you (or readers) found this exchange useful. I'm not too sure I want to become a Revision Control evangelist... I'll have to think about taking this further. I'm starting a new job soon... maybe I'll see what my colleagues use and will or won't convert them :)

Personally, I think the Stata community could gain a lot if they caught on to the culture (and software tools, I guess) of open source projects. Lots of Stata packages on SSC seem to be "owned" by just one person. The idea of having shared projects with multiple contributors could be really cool, but it needs some infrastructure to handle people working on the same files at the same time and multiple "branches" of a project. (e.g., if you want to get inspired, consider the "Module" development on http://drupal.org). For example, it's crazy that one guy maintains outreg2. When I found a bug, I had to email him instead of just fixing it.

To me, revision control started out as a backup system. I "get" that it might be over-complicated for some people. Maybe it's an issue of scale. For example, I have two major chapters in my dissertation. For the second chapter, I currently have 9,300 lines of Stata (do/ado) and SAS code, in addition to a bunch of other stuff. Plus, a lot of those lines were changed multiple times so my revision control records essentially have multiple times that (since I can "rewind" my code to what it looked like at any point in time).

I exchanged an email or two with Alan R (from Stata) and he said the problem was that Stata can only handle HTTP version 1.0 and Google uses version 1.1 (without backward compatibility). They're going to look into the issue and I've agreed to help out as much as I can.

April 15, 2011 | Unregistered Commenter keith

I don't see why the Stata community should suddenly start behaving quite differently. As I said for a different reason, we have 20 years as a community and the predominant one author or two author pattern has produced a lot of excellent programs. My impression is that even R is similar in this respect.

But nothing depends on community-wide agreement. You or anybody else can publish your Stata code and then declare that it is open to revision by others and explain what the protocol is for revising it, respecting author rights, whatever. I am sure that you will appreciate that other Stata authors prefer to maintain their intellectual rights to their code even while they place copies in the public domain.

Thus my attitude is that there need not be a debate about this. If you have a different way of working, do it. If other people see its benefits, they will surely follow.

April 16, 2011 | Unregistered Commenter Nick Cox

Dear Keith and Nick

Thank you both for sharing this exchange here. I think that this has been a very useful dialogue for the readers. I appreciate it very much.

Thanks,

Michael

April 16, 2011 | Unregistered Commenter Michael Mitchell