Wednesday, April 23, 2014

Big Test Data & JUnit

A good old session of writing-to-clarify coming up.

I have a pretty difficult challenge on my hands. Well, difficult as in non-trivial, non-standard, etcetera. Which is what makes it worth writing about.

I am importing external data into a database. Most of the difficulties are beside the point of this blog post. The difficulty I want to write about is the testing: testing with real data. You can of course write unit tests that check the details of the formats, and I have done so.

Now, the real data to import is big enough that you don't want to just stick it in a repo. So what can you do instead?

One thing I've done is this: download the data from the source over the internet. This takes some time the first time on a machine, but when the data is downloaded and cached as a file, it's instantly available.
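A minimal sketch of that download-once-then-cache idea, using only the Java standard library. The class name `CachedDownload`, the method `fetch`, and the choice of cache location are all hypothetical names for illustration, not the code from my project:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: downloads a resource once and reuses the local copy afterwards.
final class CachedDownload {

    private CachedDownload() {}

    /** Returns cacheFile, downloading it from source first if it is not there yet. */
    static Path fetch(URI source, Path cacheFile) throws IOException {
        if (Files.exists(cacheFile)) {
            return cacheFile; // already cached: no network access needed
        }
        Path dir = cacheFile.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Download to a temporary file first, so an interrupted download
        // never leaves a half-written file that looks like a valid cache entry.
        Path partial = Files.createTempFile(dir, "download", ".part");
        try (InputStream in = source.toURL().openStream()) {
            Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
            Files.move(partial, cacheFile, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(partial); // no-op when the move succeeded
        }
        return cacheFile;
    }
}
```

A test would call something like `CachedDownload.fetch(someUri, cacheDir.resolve("data.csv"))` before it starts reading. This sketch never refreshes the cache; checking for updated data would need extra logic on top.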

Downloading is also a good test in itself, since the solution will have to automatically check for updated data now and then at a bunch of URLs.

Another option is to try to cut the data down in size. But that is not so attractive: it's easy to make a mistake while trimming, at least in formats that are not as simple as CSV.

So, moving ahead with the downloading solution, I need to write tests that are certainly not unit tests. JUnit is used, so it'd be nice to make them JUnit tests. This is where the 'non-standard' thing comes in. There seems to be no standard way to mark tests as manual-only in JUnit. The one way I know is to put excludes in the configuration of Maven's Surefire plugin. That's kind of clumsy; this should really be part of the actual source code.
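One workaround that keeps the switch in the source code is to guard such tests with a JUnit 4 assumption on a system property. This is a sketch under assumptions, not a standard: the class name `ManualTests` and the property name `manual.tests` are made up for illustration. The helper itself is plain Java; the JUnit part is shown only as a comment.

```java
// Hypothetical opt-in switch for tests that should only run when explicitly requested.
final class ManualTests {

    /** Assumed property name; start the test JVM with -Dmanual.tests=true to opt in. */
    static final String PROPERTY = "manual.tests";

    private ManualTests() {}

    /** True only when the system property is set to "true" (case-insensitive). */
    static boolean enabled() {
        return Boolean.getBoolean(PROPERTY);
    }
}

// Usage inside a JUnit 4 test class (not compiled as part of this sketch):
//
//   @Before
//   public void onlyWhenAskedFor() {
//       org.junit.Assume.assumeTrue(ManualTests.enabled());
//   }
//
// When the assumption fails, the default JUnit 4 runner reports the test
// as ignored rather than failed.
```

The tests still show up as JUnit tests in the IDE, just skipped unless the property is set on the JVM that runs them. It is still a convention that every developer has to know about, so it does not really answer the question of a standard way.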

So, I recently cooked up another solution, the one posted as 'Spring Plain Java Test'. But tests written that way are not part of the JUnit tests, so as a UI for other developers it's not so good; you'd want them to be able to see these tests as... tests. And the standard way to do that is to make them JUnit tests, so they show up as tests in the developers' GUIs.

I have no good (practical, standard) solution to all of this.

This is related to a concept of readability and find-ability of projects that I have been thinking about a lot. I have written some drafts for blog posts on it, but they are in the cooler for now. Suffice it to say that I think it is a poor bunch of tools, practices and standards that we usually work with and within.

