Visiting the Indonesian Institute of Sciences (LIPI) for Dataverse

I just got back from the Indonesian Institute of Sciences (Lembaga Ilmu Pengetahuan Indonesia or LIPI) where Sonia Barbosa and I were invited to give talks and participate in workshops in Jakarta and Bandung entitled “Managing and Sharing of Research Data to Increase Research Quality.”

LIPI recently launched a research data repository for all researchers in Indonesia called Repositori Ilmiah Nasional (RIN) at rin.lipi.go.id, and last week’s workshops were oriented toward getting their researchers and librarians more familiar with data sharing in general and Dataverse in particular.

The group within LIPI that invited us is the Center for Scientific Documentation and Information (Pusat Dokumentasi Informasi Ilmiah or PDII). Slamet Riyanto, the researcher who coordinated our visit, already blogged about the event, linking to our talks and sharing a couple of pictures. We were formally invited by Sri Hartinah, who is a researcher and former head of PDII.

Our flight arrived in Jakarta the morning of Monday, May 7th. Slamet and Sjaeful Afandi picked us up at the airport and brought us to the LIPI office to meet the team from PDII who are most involved in Dataverse, have a quick tour, and get some lunch. Thank you to Sjaeful, pictured below, for picking up some gado-gado! We met Hendro Subagyo, the new head of PDII, Ekawati Marlina, Data Quality Manager, and Wasi Tri Prasetya, Division Head of Facilities for Information Access.

On Tuesday we toured the facility and met with Mego Pinandito, Deputy Chairman for Scientific Services at LIPI, who is very supportive of the new RIN repository.

Tuesday afternoon was spent training PDII librarians on Dataverse. Sonia took the stage and I answered technical questions on the side. Lunch was ayam goreng (fried chicken).

Wednesday was the workshop in Jakarta. We handed out Dataverse bags and took our seats as L. T. Handoko, Deputy Chairman for Engineering Sciences, made his opening remarks. Sonia and I gave a talk entitled “Managing, Sharing and Curating Your Research Data in a Digital Environment,” followed by my talk entitled “Analyzing Research Data Using the Dataverse Framework,” where I demoed a new add-on for Dataverse called Data Explorer. Slamet gave a talk entitled “Introducing Scientific Repository as Research Data Management and Analysis Tool in Indonesia” but the slides are in English.

After lunch (sate and soup), the rest of the talks were in Indonesian so Sri Hartinah, Madiareni Sulaiman, and Seno Yudhanto brought us to Jakarta’s “mini garden” to see a variety of traditional houses from different parts of Indonesia. We spent a little time in Bali.

We also enjoyed some locally made coffee (from Sumatra for me).

Wednesday evening we took a train to Bandung. Thursday was a holiday and we took a trip to Tangkuban Perahu, which reminded me of Yellowstone except that I didn’t eat eggs cooked in a hot spring there.

We also visited Maribaya, which had a waterfall and a zoo.

We ended our holiday with a trip to Saung Angklung Udjo where we enjoyed a “young coconut” treat while we waited for the show to start. There was a puppet show, dancing, and children playing an angklung.

On Friday we held a workshop in Bandung that was similar to Wednesday’s workshop in Jakarta. The opening remarks were made by Hendro Subagyo, the head of PDII.

One of the talks was given by Usman Muchlish from the Center for International Forestry Research (CIFOR), which runs an installation of Dataverse at data.cifor.org. Since he’s a longtime Dataverse user, I was happy to pick his brain about ways we can improve the software. I also enjoyed the talk about HPC by Esa Prakasa.

After the workshop we met with Tommy Hendrix, Head of Research and Development, and Penny Sylvania Putri, Head of Multimedia, about videos that LIPI is making to promote their work and inspire children to consider careers in science.

LIPI takes 30-day expeditions all over the country to conduct science, and I enjoyed the video they showed us about an expedition to Sumba island so much I couldn’t help tweeting about it. You can watch it on YouTube.

They are interested in making the raw footage and photos available to the world! Tagging the videos and photos with appropriate metadata is important for them and they are considering using Dataverse.

With the workshops behind us, our friends from LIPI continued to take us sightseeing over the weekend, visiting Kawah Putih, Situ Patengan, and rice fields.

On Sunday morning we woke up in Bandung, took a train back to Jakarta, did a little shopping, and made our way to the airport for a long flight home.

Thank you very much to LIPI for inviting me and Sonia to visit your institution and Indonesia! I’d especially like to thank Slamet, Sjaeful, and Rishadi (all pictured above) for showing us their beautiful country and making us feel so welcome. There are too many others to thank individually here, but I appreciate all of the hospitality from everyone at LIPI!

I have many more pictures, videos, and stories I could share but please watch this space where I’ll try to post more on a personal blog which I’ll link to from here. Update: for more pictures, please see the post about my trip to Indonesia on my personal blog.

Posted in Uncategorized

Dataverse Lightning Talk at LibrePlanet 2017

On March 26, 2017 I gave an impromptu five-minute “lightning” talk at LibrePlanet 2017 at MIT. I was one of perhaps half a dozen people who jumped up and talked about their open source project. I’m glad that it was recorded because I think it turned out ok! What do you think?

Here’s a transcript of the talk:

Hi, my name’s Phil Durbin. I work down the street at Harvard on an open source project called Dataverse. It’s Apache-licensed and I don’t have anything prepared so I’m going to keep it short and just open it to questions. The problem that it solves is that your tax money is going toward research and hopefully the outputs of that research are being put into open access journals, open access articles. But what about the data that’s associated with that research? Dataverse is a platform for hosting research data. In the academic world, you write a paper, you get a DOI for that paper, a digital object identifier, to uniquely identify your paper. But then, if you have some data associated with your paper, what do you do with it? Do you just throw it on your website? Is that website going to be around in 30 years? That’s the problem that we’re trying to solve, having a permanent place for research data.

If you scroll down on this page [ dataverse.org ], you’ll see that we have about twenty or so installations across the world that run our software. We have a conference coming up in June just down the street at Harvard. We have APIs for getting data in and out of Dataverse. We integrate with a number of other academic research-oriented sort of things. You see journals up here. Open Journal Systems is a way to host a journal online. We integrate with them so that authors of papers can deposit seamlessly from Open Journal Systems into our platform. There’s another piece of software, also open source, called Open Science Framework and if you’re using that to manage your research lifecycle you can publish data into Dataverse from there. A new integration is RSpace. It’s more of a lab notebook and so if you have all of your research in a lab notebook like RSpace, then you can publish your data into Dataverse.

Again, I don’t have too much more prepared. I could go on and on but I see a question in the back. [Inaudible.] That’s a good question. Dataverse is the software, right, but really the question you’re asking is, “Is the institution that hosts the Dataverse software going to be around in 30 years?” The plan is for Harvard to be around in 30 years. In the case of Harvard, where I work, we eat our own dogfood. We run this thing in production. It’s hosted by the Harvard Library. It’s hard for me to… I’m just a developer on the project. It’s going to depend on the institution of course, but the software, under the name Dataverse, is about 10 years old, and the trend has only been more and more adoption within Harvard and across the world so I think we’ll be around. I hope so.

Next question. [Inaudible.] Yeah, sure, we are always interested in partnering. We actually have a feature that we call “harvesting” which is based on a protocol called OAI-PMH. It’s really more for discoverability where if someone else installs Dataverse or any other platform that implements this protocol you can harvest the metadata about the dataset between different installations so you can know that data exists elsewhere. We also have some cool data exploration tools so you can run statistical analysis on tabular data. We also have geospatial mapping of datasets as well.

Another question? [Inaudible.] Again, we tend to replicate the metadata, not the actual data itself, but I imagine if you were truly going under, the ship is sinking, you could probably reach out to one of the other twenty installations of Dataverse and say, “Hey, we’re going to go dark, can someone take our data?” It would be more of an arrangement between institutions I think, but there’s a growing community Dataverse so I would think that someone would pick up the slack, so to speak.
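To make the harvesting answer a bit more concrete, here is a minimal sketch in Python of the two things a harvester does over and over: build an OAI-PMH request and pull dataset identifiers out of the response. The base URL `https://demo.example.org/oai`, the sample identifiers, and the helper names are made up for illustration. Only the verb and argument names (`ListIdentifiers`, `metadataPrefix`, `set`, `resumptionToken`), the `oai_dc` format, and the XML namespace come from the OAI-PMH specification. It does no network I/O and is not meant to show how Dataverse itself implements harvesting.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_identifiers_url(base_url, metadata_prefix="oai_dc",
                         set_spec=None, resumption_token=None):
    """Build a ListIdentifiers request URL (helper name is hypothetical)."""
    if resumption_token is not None:
        # The protocol treats resumptionToken as an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListIdentifiers",
                  "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListIdentifiers",
                  "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_identifiers(xml_text):
    """Return (identifiers, next_token_or_None) from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in
           root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing resumptionToken means there are no more pages.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# A hand-written sample response; the identifiers are invented.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>doi:10.5072/FK2/EXAMPLE1</identifier>
      <datestamp>2017-03-26</datestamp>
    </header>
    <header>
      <identifier>doi:10.5072/FK2/EXAMPLE2</identifier>
      <datestamp>2017-03-26</datestamp>
    </header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_identifiers_url("https://demo.example.org/oai"))
    print(parse_identifiers(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL, call something like `parse_identifiers` on the body, and keep requesting with the returned token until none comes back. Note that only metadata travels this way, which matches what I said in the talk: the data files themselves stay with the original installation.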

I tweeted about the talk at https://twitter.com/philipdurbin/status/… if you would like to discuss it there.

The video can also be found at https://media.libreplanet.org/u/libreplanet/m/lightning-talk-philip-durbin/ and https://www.youtube.com/watch?v=-GUr-cd_OWQ is a direct link to the YouTube video above.

The transcript can also be found at http://wiki.greptilian.com/talks/2017/libreplanet-dataverse-lightning-talk and if you notice a typo in it, please send me a pull request at https://github.com/pdurbin/wiki .

DVN 3: Dataverse back in 2013

Please note: This content was originally hosted at http://people.iq.harvard.edu/~pdurbin but that site has gone dark and I wanted to preserve what I wrote in the timeframe between December 2012 and May 2013. I had just begun working as a developer for Dataverse and this was my write up of what Dataverse was back then. I was getting oriented with the features offered, the code, the community, and the ecosystem. Throughout you’ll see references to “DVN” because that’s what we called the Dataverse software back then. It stood for “Dataverse Network” and we called it “DVN 3.” The software has since been rewritten and rebranded as just “Dataverse”.

It’s surprising how many of the links in the post no longer work. We managed to get the domain “dataverse.org” (replacing “thedata.org”) and we did a rewrite, which included some rebranding. Here are updated links:

Ok, on to the old post, last updated in 2013:


Philip Durbin, Software Developer

Philip Durbin
  • open source
  • data
  • collaboration
  • community

I work on The Dataverse Network Project ( http://thedata.org ), an open source web application for sharing, citing, analyzing, and preserving research data.

If you have research data… you can host it for FREE at http://thedata.harvard.edu 🙂

dataverses

A “dataverse” is simply a container, a place to upload your data: http://en.wikipedia.org/wiki/dataverse

 

viz example

If you have time series data (on recession trends for example) you can have your DVN visualize it (as above) by following http://guides.thedata.org/book/data-visualization

DVN can provide descriptive statistics of your data. Here’s the age variable from a census of Utah in 1880:

 

R example

 

On tabular and network data, you can perform statistical analysis by following http://guides.thedata.org/book/subset-and-analysis

http://dvn-demo.iq.harvard.edu is a great place to test out the DVN software. Go ahead and upload some data and play around. 🙂

The Dataverse Network (DVN) software is open source. The code is hosted at https://github.com/iqss/dvn and bugs are tracked at http://redmine.hmdc.harvard.edu/projects/dvn

Your institution is welcome to download and set up its own Dataverse Network installation on its own server. If you need help installing your DVN, please email us at support@thedata.org

If you don’t have a server handy, you can try installing a DVN on a virtual machine on your laptop with https://github.com/pdurbin/dvn-vagrant or https://github.com/dvn/dvn-install-demo . Please don’t use this in production. 🙂

If you’d like to contribute code, please see http://devguide.thedata.org

If you’d like to work on bugs that have been assigned to me, please be my guest. 🙂

I tend to work on the business logic:

JSF diagram

(Image from http://blog.xebia.fr/2009/06/03/seam-repenser-larchitecture-des-applications-web/ )

If you’d like to get involved with the DVN community, you can check out our tweets at http://twitter.com/thedataorg or join the mailing list at http://groups.google.com/group/dataverse-community

I started a Google+ page for DVN and I sometimes chat with people in #dvn on Freenode: http://irclog.iq.harvard.edu/dvn

The Dataverse Network is one of many products developed by The Institute for Quantitative Social Science (IQSS) at Harvard University: http://www.iq.harvard.edu/products

The source for many IQSS projects can be found under https://github.com/iqss but see http://iqss.github.com/github-at-iqss for a more complete list.

HUIT LTS

Both DVN installations at Harvard (the IQSS Dataverse Network and the Harvard-Smithsonian Astronomy Dataverse Network) are ably hosted by Harvard University Information Technology (HUIT) Library Technology Services (LTS): http://library.harvard.edu/project-update-dataverse

From time to time I check http://bugz.hul.harvard.edu/buglist.cgi?product=Dataverse for anything LTS might need from me.

Earth from http://www.flickr.com/photos/donkeyhotey/5679642883/

http://thedata.org lists Dataverse Networks around the world.

This web page is written in Markdown and rendered into HTML with Jekyll. The source can be found at https://git.huit.harvard.edu/pdurbin/pdurbiniq

My personal website is http://greptilian.com

Hello world!

Thanks for visiting! Please see the about page for more about me as well as my personal website at greptilian.com. As of this writing I work at IQSS on Dataverse. I like it there. 🙂

This being the first post and all, I guess I can get a little meta. I created a blog here at blogs.harvard.edu because of the retirement of the hosting I was using previously. Hosting at scholar.harvard.edu was suggested to me but I was turned off by the description on the homepage that read, “OpenScholar@Harvard is a free web site building tool available to faculty, graduate students and visiting scholars at Harvard.” I’m a staff member and while I appreciate research, scholarship, and science generally, I don’t consider myself a scholar. I don’t really consider myself much of a blogger either but when I googled for “Harvard blogs” I was happy to discover that I had missed the news that blogs.law.harvard.edu had dropped “law” from the URL, becoming this site. I think it’s fantastic that Harvard is offering a blogging platform for anyone with a harvard.edu address. Thanks!

Technically, I still have an old blog at people.fas.harvard.edu/~pdurbin/blog but I haven’t updated it since I switched to my current job in late 2012. Also, there’s a blog for my current project at dataverse.org/blog but I’m not the one who writes those posts. I’m sure I’d be welcome to be a guest blogger there but I like having my own space to have my own voice. For example, I’m considering giving a talk at the Harvard IT Summit some day but I thought that perhaps I would start with a blog post to gauge interest in various topics and reach a wider audience.

Oh, I don’t plan to enable comments on this blog. If you spot a typo or otherwise want to reach me, my contact information is on my about page.

Phew. I think that’s enough meta for now. 🙂
