Working with big data of all sorts is a good skill to have. There are many related skills: relational databases, NoSQL databases, data mining, natural language processing… A solid command of each of these skills means a good salary and many job openings.
Personally, I have worked with several methodologies rather briefly, and while I have some skills and practice in MySQL, MongoDB, Solr and several other related technologies, I am no guru in any of them. I do not feel sufficiently confident to describe the optimal approach, so below I describe what I do when faced with database-related challenges.
Typically the first thing I do is learn the details of the business. This means creating a lot of mental markers for each business activity and the details of that activity. Usually I need either to create a full use-case scenario from scratch using reference sites, or to interview the stakeholders and understand their perspectives. Either way I need to speedread a lot of specifications and similar designs. I use mental colouring to represent the various perspectives of the process: customer retention, sales support, product tree, the inner structure of each product, GUI etc.
Very often I do not start from scratch, and I need to visualize the existing data structures as yet another perspective. Quite often this challenges my understanding, and then either my understanding changes or bugs get fixed. Imagining what goes on in the head of another programmer is very hard. Fortunately most of us follow only a small set of conventions, so if I understand the conventions followed by the original designers (and their perspectives), the integration is easier.
Once I have all the perspectives in my head I start merging them into a coherent form. Operations on markers are a hard skill to learn. Mindmaps are easy to manipulate. Working with loci may feel very much like city planning, so I would not use it here. PAO memorization is very natural in this context, since it is easy to imagine a specific stakeholder doing a specific task. The basic manipulations include linking or merging several objects, chunking, breaking complex markers into simpler markers and adding details to existing markers. I think that it is hard to do all the work in one's head, so I use Visio/PowerPoint/UML software to visualize the markers I work with. Occasionally I build a simpler scaled-down model of the system to facilitate understanding of all the processes that need to be handled.
When ready, I write down all the required data (or some of it) in the abstract form of tables or hierarchies. If I have the business data in my head as markers, the task is very simple. Each marker becomes a table, each detail of the marker becomes a column, the links form indexes and references, and the instances of the marker in different contexts form rows. A similar method works for XML, but the structure may be more complex: XML is perfect for complex forms of chunking. At this point I can either use a generic database visualization/design tool or start actually building the database.
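As a minimal sketch of this mapping, assume two hypothetical markers, "customer" and "purchase", with a link between them. The table names, columns and sample data are invented for illustration, and SQLite (through Python's standard sqlite3 module) is used only because it needs no setup; the same idea applies to any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces references when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Each marker becomes a table; each detail of the marker becomes a column.
CREATE TABLE customer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT
);
CREATE TABLE purchase (
    id          INTEGER PRIMARY KEY,
    -- The link between the two markers becomes a reference...
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    product     TEXT NOT NULL,
    amount      REAL NOT NULL
);
-- ...and an index on that link.
CREATE INDEX idx_purchase_customer ON purchase(customer_id);
""")

# Instances of a marker in different contexts become rows.
with conn:
    conn.execute(
        "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)",
        (1, "Alice", "alice@example.com"),
    )
    conn.executemany(
        "INSERT INTO purchase (customer_id, product, amount) VALUES (?, ?, ?)",
        [(1, "keyboard", 49.0), (1, "monitor", 199.0)],
    )

# Column names of the purchase table, read back from the schema.
columns = [row[1] for row in conn.execute("PRAGMA table_info(purchase)")]
purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
```

With the reference in place, the database itself rejects a purchase that points to a customer who does not exist, so a broken link between markers shows up as an error instead of silently bad data.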
Next I design queries. Since I processed several business scenarios to generate the markers, I traverse the same business scenarios to retrieve the details from the markers. PAO memorization is very well suited for revising use case scenarios and getting the data, since I can easily visualize a stakeholder operating the system to get the data, looking in his papers, making phone calls to the clients etc.
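To illustrate, here is a rough sketch of turning use-case scenarios into queries. The scenarios ("a sales agent needs a list of people to phone about a product", "an agent reviews what one customer has bought"), the schema and the function names are all hypothetical; the point is only that each step a stakeholder takes maps to one retrieval from the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT);
CREATE TABLE purchase (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    product     TEXT NOT NULL
);
INSERT INTO customer (id, name, phone)
    VALUES (1, 'Alice', '555-0100'), (2, 'Bob', '555-0101');
INSERT INTO purchase (customer_id, product)
    VALUES (1, 'keyboard'), (1, 'monitor'), (2, 'mouse');
""")

def call_sheet(conn, product):
    """Use case: the agent lists whom to phone about a given product."""
    rows = conn.execute(
        "SELECT c.name, c.phone FROM customer AS c"
        " JOIN purchase AS p ON p.customer_id = c.id"
        " WHERE p.product = ? ORDER BY c.name",
        (product,),
    )
    return rows.fetchall()

def purchase_history(conn, customer_name):
    """Use case: the agent reviews what one customer has bought."""
    rows = conn.execute(
        "SELECT p.product FROM purchase AS p"
        " JOIN customer AS c ON c.id = p.customer_id"
        " WHERE c.name = ? ORDER BY p.product",
        (customer_name,),
    )
    return [row[0] for row in rows]
```

Writing one small function per scenario keeps the query list aligned with the use cases, and the `?` placeholders keep user input out of the SQL text.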
Database optimization comes next. Usually the process is very technical: I go through the most important use cases and try to imagine the amount of processing required, i.e. the computer's perspective of each use case. Then I try to reduce the load by using indexes, caching, and more complex methods. I imagine what can go wrong and plan transactions and rollback operations, hooks, scripts etc. Finally, I write a GUI to exercise the whole system and test the scenarios in real life.
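Two of these steps can be sketched in a few lines, again with SQLite and a hypothetical purchase table. The first part asks the database how it would execute a query before and after adding an index; the second wraps a multi-row insert in a transaction so that a failure rolls everything back. The helper names `plan` and `record_purchases` are made up for this example, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL,"
    " amount REAL NOT NULL CHECK (amount > 0))"
)

def plan(sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)

query = "SELECT amount FROM purchase WHERE customer_id = ?"
plan_before = plan(query, (1,))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_purchase_customer ON purchase(customer_id)")
plan_after = plan(query, (1,))   # the plan now refers to the new index

def record_purchases(customer_id, amounts):
    """Insert all amounts atomically: every row is stored, or none is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for amount in amounts:
                conn.execute(
                    "INSERT INTO purchase (customer_id, amount) VALUES (?, ?)",
                    (customer_id, amount),
                )
        return True
    except sqlite3.IntegrityError:
        return False

ok = record_purchases(1, [10.0, 25.5])
failed = record_purchases(1, [5.0, -1.0])  # second row violates the CHECK
total_rows = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
```

In the failing call the valid first row is rolled back together with the invalid one, so only the two rows from the successful call remain in the table.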
Now, I think most architects, programmers and database designers work in a very similar way. Superlearning enables fast collection of information, easier visualization of data and use cases, and learning of the specific abilities and tweaks of the database system I work with, and it facilitates attention to detail in links and indices. In technical skills nothing can trump years of experience, but a proper methodology may reduce the gap.