It has been 2 to 3 months since I forced myself into using Office 2007. To a certain degree, I put myself in the situation of being one of those users that power users frequently shout at to RTFM. I deprived myself of using Yahoo (look, I am no longer using the 'other' search engine) to look for answers. I successfully avoided pressing F1. And I practically asked no one to provide me the usual "How To's". In short .. I am like the everyday user. More like the "clueless" one.

My goal for this is to know if I can be functional and productive in my first 3 days with Office 2007. The reason for subjecting myself to this stress test is to gauge whether the new GUI catapults MS into uncharted markets or instead pulls it down and drags sales. The GUI change was drastic. Prior to this "test", my assumption was that this is an MS-designed GUI ... and I should trust it. Furthermore, since Office's early days, it has had the reputation of making people productive almost instantly. As far as I can recall, I could compose complex document formats within my first fifteen minutes of playing with each new version of Word. I could also fashion out my usual spreadsheets in record-breaking time.

Now, it's Office 2007's turn to be put to a test in my personal usability lab. Here is how it went.

Installation
* Open DVDROM
* Inserted Office 2007 Beta 2 DVD
* .. installing (hmmmm fast)
* got few mouse clicks here and there
* Clicked Finish

First 15 minutes
* I was in awe
* the GUI looks so cool
* wow
* astig

Succeeding minutes
* hmmmm
* do I have to Goog ... err Yahoo? Oopps ... no  .. nyet!
* I have to rush a quotation by the next day .. I need to use Word 2007 ... I might impress the prospect with something new and cool
* hmmm ... figuring out (huhoooom)
* hmmm perhaps I am just stressed out ... I should just wake up early the next day and Word 2007 would not fail me.
* zzzzzzzz

Day 1
* Coffee smells great
* I have to rush the quotation to be sent in a couple of hours
* ... drowsy despite the coffee
* nuniuninuninuninuninu .. still figuring out how to do it using Word 2007
* time is running out
* so I switched to the other PC and there was Word 2003

Day 2 
* how come I can't figure things out with Word 2007?
* spent the day wondering if, all of a sudden, I am no longer an able (power?) user

Day 3
* 2nd try
* same results ... perhaps I am just distracted by too many things
* Hmmm ... I might give Excel 2007 a try ... and wait ... it looks cool, same as Word 2007
* where can I find the ........ er? hmmm
* Wait .. perhaps it is somewhere here ... hmmm (clueless)
* &$%^%$ Angry
* I think I am getting dumber each day as I grow older


After my first few tries ... I almost gave up. The one thing that made me say I survived was when I ran a PPT authored using 2003. I used PowerPoint 2007 in one of my most unexpected demos, to a client that is very hard to please. Whew. The slideshow was really slow .. but I was able to show the last slide unscathed.

Each day over the last 2 months, I was able to learn new things about Office 2007, with occasional cursing here and there. I am still struggling with it. I still can't figure out how Microsoft came to this. Perhaps the problem is with me. Perhaps there is something blocking my brain.

If most users out there, the everyday people, the typical problem between the keyboard and the chair, are like me right now ... MS needs to do a lot of coaxing to convince users to upgrade. On the GUI alone, this might become the turning point that shows whether MS made a glaring mistake or had another brilliant, competition-killing design idea.

Though Office 2007 boasts an array of new products in its fold, having a GUI that can challenge even technically savvy individuals can spell the difference between boom and bust. I am hoping that the case I presented here was just an exception or a glitch. I am hoping I am wrong about this.

I have met a lot of programmers through the years that I have been doing software development work. Their levels of expertise vary. Some are highly academic and will love to engage you on any topic, arguing their way with flair on any technical subject matter. Some are real no-nonsense types who can get you your code as agreed upon. Some are obsessive about optimizing their code to no end. Some are sharp enough to spot a bug from a mile away. Some are really like those geeky operatives who can weave code with razor-sharp accuracy.

 

But there is one thing I notice in all of them. There is one thing so obviously common to all of them. THEY ARE ALL EXCELLENT TESTERS. As a result, they deliver exactly what you would expect from them.

 

Point is ...

  • When a programmer does sloppy testing, what do we get? Sloppy code!
  • When a programmer does mediocre testing, what do we get? Mediocre code!
  • When we get sloppy and mediocre code, what do we have? Unhappy users and a sloppy, mediocre tester.

 

GOOD PROGRAMMERS ARE EXCELLENT TESTERS

 

  • When a programmer does excellent testing, what do we get? Bug-free code!
  • When we get bug-free code, what do we have? Happy and contented users!

 

When we have happy and contented users, we can imagine anything good to come our way.

 

Knowing how to test, regardless of methodology, is a must for every programmer. I have always believed that for one to become a good tester, he needs to fully understand a lot of the non-technical aspects surrounding his work as well as the technical ones. If he fails to grasp and anticipate the possible variations and outcomes of his work, he fails to become an excellent tester. If he fails to become an excellent tester .. he is probably a lazy and lousy programmer.

 

I still don't believe that a programmer can skip testing and pass it on to another person who will do the testing for him. That is just a waste of time and resources.

Some years ago, we made a major decision to create one super DB for our product, a suite of applications. We thought it was the right decision. It was a good decision, but we never realized the non-technical implications of having such a design.

 

Having a single super DB for multiple inter-operating applications has its advantages, but it also has its drawbacks. We tackled a number of considerations when we decided to go single DB, but the primary reason we opted for it was that we thought it would be neat to have a single DB: easy to develop, tight and highly integrated, fewer things to manage and secure, and fewer things to break. We also thought that SQL Server and the supporting hardware would mature fast enough in the years to come that a single multi-gigabyte or terabyte DB would be a neat choice. We overcame some objections, like having a single point of contention in terms of DB development. These decisions held because our projections ruled that even our biggest clients would likely use less than a couple of terabytes of data even after 25 years on our system; a typical maximum setup would be in the 50 GB range.

 

For about 2 years, the single DB mindset stood and was able to contain our expectations. However, a little more than a year ago, we decided to break away from the decision to have just a single DB. The reasons? Well, as our number of tables breached 150, I said to myself that this would be a pain, as we were only doing about 5% of what we were supposed to do. But the primary reason was that, since we are just a small ISV without enough resources to spread around, the database designer (myself) became a point of contention and a potential single point of failure. There are some more reasons, but I will probably write about them someday.

 

To fast forward: we now have 2 databases that are related to each other and used by 5 full-blown apps, and we are adding more databases as we develop more apps. However, having multiple DBs can be daunting too. Some of our SPs reference SPs, tables, views and functions of the other database. This forced us to strictly use the highly static 4-part, fully qualified naming convention (server.db.schema.object) when referencing DB objects found in other databases.

 

When we implemented this multiple DB approach, SQL Server 2005 was nowhere in sight, and SQL Server 2000 only provided 4-part naming and Query Analyzer's find-and-replace as possible quick solutions to the problem.

 

So in SQL Server 2000, to reference a 'remote' DB object, you do something like the example below:

 

EXEC mylinkedserver.accountingdb.dbo.usp_PrintString 'Hello World'

SELECT * FROM mylinkedserver.accountingdb.dbo.receivables

 

The drawback with this is that if you have multiple references and the referenced object changes, for example when the server name or database name changes, you are forced to find all those references and update them. This is tedious and error prone. How I wish there were refactoring in SQL Server that goes deep down to this level.

 

SQL Server 2005 introduced a solution to this problem. You may be wondering what that Synonyms folder is doing when you expand your DB objects in Management Studio. To illustrate what this new feature is all about, follow the script below:

 

USE another_db
GO

CREATE PROC usp_print_string
    @param1 AS varchar(100)
AS
    PRINT 'printing usp_print_string ' + @param1
GO

 

In the script above, we created the stored procedure usp_print_string inside the database another_db. To call usp_print_string from another database (or another instance of SQL Server), we can do the following:

 

EXEC another_db.dbo.usp_print_string 'Hello World'

 

If the stored procedure is used more than a hundred times, you will see the same number of code instances all throughout your application. That can be a nightmare if you happen to change, for example, the database name from another_db to remote_db.

 

Let us see how SYNONYM works. Try the script below:

 

USE my_db1
GO

CREATE SYNONYM my_sp_from_another_db
FOR another_db.dbo.usp_print_string
GO

 

The script just created an entry in SQL Server saying that my_sp_from_another_db is actually the same as another_db.dbo.usp_print_string. Check out the Synonyms folder in Management Studio. As you create synonyms, you will see their definitions under this folder.
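
If you prefer a query over browsing the folder, the sys.synonyms catalog view lists each synonym together with the object it points to. A quick sketch:

-- List all synonyms in the current database and what they resolve to
SELECT name, base_object_name
FROM sys.synonyms
GO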

 

Let us see it work. Try the script below:

 

EXEC my_sp_from_another_db 'hello'

 

The example

 

EXEC my_sp_from_another_db 'hello'

 

is shorter than

 

EXEC another_db.dbo.usp_print_string 'Hello World'.

 

Looks cool to me! :)

 

But let us try another twist. Let us say that you have now renamed another_db to remote_db. All one hundred instances of EXEC another_db.dbo.usp_print_string 'Hello World' will throw an exception, as each won't be able to find another_db anymore. To solve this, you need to find all the instances and change them to

 

EXEC remote_db.dbo.usp_print_string 'Hello World'.

 

This is definitely NOT COOL!

 

Using SQL Server 2005, you simply do the following:

 

DROP SYNONYM my_sp_from_another_db
GO

CREATE SYNONYM my_sp_from_another_db
FOR remote_db.dbo.usp_print_string
GO

 

You drop the old synonym and replace it with a new one. Check out and run

EXEC my_sp_from_another_db 'hello' again.

 

This is NEAT!

 

I'll leave it up to you to check out the other niceties of SYNONYMS and how to implement them. There are a lot of benefits to using this new feature. One that I can think of now is versioning of stored procedures. I'll write about this in my next blog.

In these times when the likes of Google, Yahoo, eBay and CNN run their massive Internet applications and services in mammoth data centers filled with thousands of powerful servers, ready to process millions of user requests and data streaming through mega-bandwidth networks, it is almost automatic for us developers and database enthusiasts to think of processing data in real time whenever somebody asks how we can handle a massive amount of data in a very constrained, highly distributed and unreliable environment. Is there such an environment? Let us see.

 

Let us imagine the rural banks in the country. These rural banks are small and can be found in the most remote of barangays. They are not so technology savvy, and they still use traditional means of communication. Perhaps only a few have huge servers, while the rest only use personal computers. To make it more constrained and limited, let us imagine that these rural banks use varying forms of networking technology, perhaps some with unreliable dial-ups and low-bandwidth leased lines. To make it even more pitiful, let us imagine that some can only afford to use decades-old floppy diskettes to transport data from one branch to another.

 

Here is the big question. Can we come up with an affordable solution that would allow them to experience secure nationwide bank-to-bank interoperation?

 

Our edge in technology might convince us that the exercise would be futile: how can we interconnect them when not all of them are connected to a WAN, and we don't have any assurance of a reliable network connection? We might also wonder how we will integrate the ones that still use diskettes.

 

Yes, absolutely, we can cook up some system for them ... but there is one simple question we need to ask: Do We Always Have To Process Data in Real Time?

 

To some, yes. For me? NO.

 

Anyway, let us not delve too much into what that solution would look like. Instead, let us take a look at an old, reliable technology that has made an appearance in some of the latest software offerings out there and that can help us solve our rural bank challenge.

 

I am referring to SQL Server 2005's Service Broker. For those who are familiar with MSMQ, you will easily comprehend this new feature of SQL Server. Service Broker is designed around the basic functions of sending and receiving messages (just like email). In its advanced form, messages can be secured, queued and routed to specific destinations regardless of whether the communication channel is online or offline (again, just like email).

 

Imagine one rural branch (Quiapo Branch) sends out a money-transfer message to another (Sulo Branch) to notify the Sulo Branch that X amount of money can be released to an authorized recipient. The Quiapo Branch teller just clicks "Send Message" without knowing that their DSL line just went down. Using Service Broker, the teller won't even notice that the line went down. Service Broker will just queue and keep the message until the DSL line resumes. Once online, Service Broker sends out the message to the intended destination. If the DSL interruption lasts about an hour, the transaction delay would be just a bit more than an hour.
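
To make that a bit more concrete, here is a minimal sketch of the Service Broker objects involved on the sending side. The message type, contract, queue, service names and the XML payload are all made up for illustration, and the receiving branch would need matching objects (plus routes if the branches are on different instances):

-- One-time setup on the sending (Quiapo) database
CREATE MESSAGE TYPE [//bank/MoneyTransfer] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//bank/TransferContract]
    ([//bank/MoneyTransfer] SENT BY INITIATOR);
CREATE QUEUE TransferQueue;
CREATE SERVICE [//bank/QuiapoTransferService]
    ON QUEUE TransferQueue ([//bank/TransferContract]);
GO

-- Sending a transfer; if the line is down, the message simply waits in the transmission queue
DECLARE @dialog uniqueidentifier;
BEGIN DIALOG CONVERSATION @dialog
    FROM SERVICE [//bank/QuiapoTransferService]
    TO SERVICE '//bank/SuloTransferService'
    ON CONTRACT [//bank/TransferContract]
    WITH ENCRYPTION = OFF;  -- keeps the sketch simple; real setups would secure the dialog
SEND ON CONVERSATION @dialog
    MESSAGE TYPE [//bank/MoneyTransfer]
    (N'<transfer amount="5000" recipient="Juan dela Cruz" />');
GO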

 

With this, we are giving the rural banks the means to operate just as if they were always online. They need not be bothered when they are offline. The system will just resume normal operations automatically. So having unreliable and very slow communication/network lines won't be that much of an issue.

 

So who says that we always need to process data in real time?

 

I'll try to find time to come up with something that tells more about Service Broker. Perhaps somebody out there would want to as well :)

 

 

------------------------------

Additional Information: Service Broker can be used in a lot of scenarios. It can even be used to customize complex data replication requirements. It can be used to split data paths into multiple channels so that messages can be distributed across as many queues as possible (or queued to a file and transferred onto floppy disks), thereby increasing capacity and the chances of still transporting data. This also means that Service Broker can be used to scale applications. Service Broker is also an excellent solution if you want to integrate loosely coupled applications.

Lately, I have had an overload of information regarding "software architecture". I have been doing software for a decade or more already, and I have seen enough to know what works out there and what is just a passing fad. Today, developers, programmers, or whatever you want to call them, are so enamored with discussing anything about software architecture, with a vigor and gusto that I have not seen before.

 

In 2000, when I was doing some software architecture work as an external consultant for a really large project, the technical project manager cautioned me not to "over-architect". It occurred to me that she had probably experienced something that made her say that. I saw the sense in that cautious remark, and from then on I instilled it into every design work I did thereafter. Until now, I think I have been a responsible individual every time I do software architecture work.

 

I recently asked that technical project manager why the warning. She said something to this effect:

  • The first time you piece together the super design of your mammoth killer-app, you make all the mistakes of a first timer.
  • The second time you piece them all together, you tend to over-compensate on the shortcomings of the first.
  • The third time you piece them together, you now have the confidence of having a near perfect architecture.

Perhaps this is the reason there is a saying in the software development space that "you can't make it right until you reach version 3.0". Or was I referring to Microsoft apps of the past?

 

The big question for us right now is: how many of us have managed to stay on to complete our projects and reach version 3.0?

 

Pundits have advocated that this could be avoided and things could be done right on the first try; thus, we see everywhere guides to the best patterns and practices in various colors and flavors, and heated debates on almost everything, scrutinizing each approach and questioning why a particular practice is best to implement while others should not be. Some worked, some failed. Some were misapplied, some were near perfect. But one thing is conclusive ... none of these is a SILVER BULLET that would solve all our woes in software architecture and the business of software design.

 

What is very noticeable nowadays is that developers tend to believe almost everything they read out there. They are fascinated by the fact that they can technically articulate and converse proficiently on any technical subject at hand, and they bask in the feeling of knowing the differences between subjects. However, I also happen to notice that these same software architects and designers build their software around the capabilities and limitations of whichever technology architecture they happen to believe in (or perhaps the one easiest for them), instead of first understanding the requirements and adopting something that fits what is needed within the limits of what they know. Often, they blame the lack of support from their preferred architecture when they encounter something that is really hard to accomplish.

 

In almost all cases too, software architects and designers seem to spend too much time architecting a solution, some without regard to known constraints such as time, financial resources, and experience (or the lack of it) for the said solution. Some tend to apply the complexities of a multi-tiered architecture to even the simplest of apps, just to be known to peers as architecture savvy.

 

So I ask, what do others do? Do they understand the requirements first, then find and apply appropriate patterns and practices that can deliver the requirements and expectations of users? Do they instead try to fit the user requirements to their preferred architecture? Do they pick several of the best out there and find ways and means to glue things together to come up with something deliverable?

 

As for me, I tend to absorb the requirements first and know my constraints (time, manpower, logistics, technical capability, experience, etc.). Then I immerse myself in what the software will be like in its final form. Once I get a clear picture of what the software would be like, that is when I start doing technical work. This involves picking the right architecture to adopt. It is not always necessary for me to check on what is new. If what I have known and tested to work fits, given all the scenarios and what-ifs, I pick that architecture regardless of how relevant or how obsolete it has become. Sometimes I see people who have a knack for choosing whatever is new and untested out there as long as it is popular. It sounds hip anyway. If I find my stock of architectures insufficient for the work at hand, then that is the time I start investigating other possible options. This can be really hard sometimes, since in most cases you don't have the benefit of experience. If I am not careful and responsible in choosing the right stuff, the consequences could be unimaginable.

 

By and large, I think having too much architecture can be as bad as having none or too little. I also strongly believe that software designers and architects should focus on responsibly knowing how much architecture is needed for a particular piece of software, not on producing the technically most elegant or advanced architecture for any software just to prove and show that we are architecture savvy.

 

Since there is really no silver bullet when it comes to software architecture, and since we will forever be avid learners of what is new and cool out there, and forever be adapting to changes dictated by our users, I am espousing that we be responsible in selecting and applying what we think is the right architecture for the job at hand.

 

Now I think this is the silver bullet after all: Responsible Software Architecture!

What is the best thing about using the LIKE operator in T-SQL? It is very convenient for doing some quick pattern searching.

 

Many of us use LIKE for this reason alone. For example, running the following query

 

SELECT first_name
FROM _name
WHERE first_name LIKE '%electra%'

 

on a table with several thousand rows would easily give you results in the blink of an eye (assuming that a non-clustered index was created on the first_name column).
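
For reference, that supporting index could be created with something along these lines (a sketch; the index name is made up, and the table name simply follows the example above):

-- Non-clustered index to support searches on first_name
CREATE NONCLUSTERED INDEX ix_first_name
ON _name (first_name)
GO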

 

But let us go deeper into how SQL Server executes these LIKE queries:

 

-- Contains Search
SELECT first_name
FROM _name
WHERE first_name LIKE '%electra%'

Execution Plan Used = Index Scan

-- Ends With, Where Last Character Can Be Any Character
SELECT first_name
FROM _name
WHERE first_name LIKE '%ann_'

Execution Plan Used = Index Scan

-- Begins With Search
SELECT first_name
FROM _name
WHERE first_name LIKE 'electra%'

Execution Plan Used = Index Seek

 

 

We know that every time SQL Server executes a query, its Query Optimizer tries its best to determine the best execution plan it can. It will always attempt to use an Index Seek wherever and whenever possible.

 

In the 1st and 2nd examples, SQL Server failed to find a way to use the index for a seek, so it opted for an Index Scan. In an Index Scan, SQL Server reads all the rows in the index. This means SQL Server has to examine each row in the index to determine whether it qualifies for your search criteria. If you are searching for just 3 rows within a million-row table, and those records are in the 8th, 9th and 10th positions, SQL Server will still have to scan up to the millionth record of the index before it can return all the rows that you need. That is quite unacceptable to most of us. However, an Index Scan isn't bad at all. There are times when an Index Scan is faster than an Index Seek. Every time you see in your execution plan that SQL Server is using an Index Scan, take a look at the number of rows it is processing. If it is just a few thousand rows, then you are fine and the Index Scan can be really fast.

 

In the 3rd example, SQL Server was able to determine that it could use the index effectively with an Index Seek. An Index Seek, as we know, means that SQL Server's Query Optimizer was able to find a useful index to locate the records we are searching for. If we are searching for 3 records in a million-row table, SQL Server will return the rows in the blink of an eye.
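
An easy way to see which of the two you are getting, and how much work each does, is to compare the plan text and the I/O statistics yourself. A quick sketch, using the same example table as above:

-- Show the plan SQL Server would use for each query (queries are not executed)
SET SHOWPLAN_TEXT ON
GO
SELECT first_name FROM _name WHERE first_name LIKE '%electra%'  -- expect an Index Scan
SELECT first_name FROM _name WHERE first_name LIKE 'electra%'   -- expect an Index Seek
GO
SET SHOWPLAN_TEXT OFF
GO

-- Or actually run them and compare the logical reads reported for each
SET STATISTICS IO ON
GO
SELECT first_name FROM _name WHERE first_name LIKE '%electra%'
SELECT first_name FROM _name WHERE first_name LIKE 'electra%'
GO
SET STATISTICS IO OFF
GO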

 

So how does this information help us?

 

Often, when we develop apps, we tend to use the LIKE operator a lot. During this stage, we seldom find performance problems related to the LIKE operator because, typically, our tables are not yet populated with large amounts of data. In most cases, we have only a few hundred rows or less in our tables during the development stages. Perhaps we are thinking that since we have created an index, we are OK.

 

We will only notice the performance problems associated with LIKE when the tables get heavily populated with real data.

 

I hope this helps us understand when and how to use the LIKE operator. Hey, you may also want to explore SQL Server's Full Text Searching.

 

This case was raised to me by my colleagues in the company. I thought it would be nice to write about it on my weblog. This isn't a technical article, but rather a reflection on when I think the right time to create indexes is, based on the experiences I have been through every time I design databases and use them. I am not sure if this is a good practice, but it has worked for me ever since I got hooked on SQL Server back in version 6.5.

 

So when do I start creating indexes? Read on …

 

When I lay out my DB objects, I seldom define indexes right away. But of course I make notes as to where and what kind of indexes I'll be creating later on. I let the database drag on for quite some time during the development stage without my custom indexes. I let myself write queries without the benefit of indexes. I also allow some time for the database to get populated with enough data. Most often, I start creating indexes when the system is almost feature complete and most of the complex queries have been written and run against some amount of data. We usually call this phase the load testing phase.

 

Why do I do this? Here are my two major reasons:

  • Indexes can improve your query performance drastically. During the query writing phase, having indexes can give you the impression that your scripts are blazingly fast. This is made worse by the fact that, typically, your database contains just a token amount of data.
  • Since we all create optimal indexes based on the kinds of queries SQL Server will often encounter and process, creating indexes too early can give you a skewed impression of performance.

Both reasons can potentially affect your capability to detect query performance problems.

 

By creating the indexes at a later stage of the development phase, you force your database to work harder without the benefit of indexes. Performance problems will begin to appear if you have crappy queries. Still, at this point, resist creating an index. Exhaust your means of optimizing the query without the benefit of indexes. If you want to create indexes at this point for the sake of testing, just be sure to delete them after the testing, as in the sketch below.
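
For instance, a throwaway index for such a test run could look like this (the table, column and index names are made up for illustration):

-- Temporary index just for a load-testing run
CREATE NONCLUSTERED INDEX ix_sales_order_customer_id
ON sales_order (customer_id)
GO

-- ... run and time the suspect queries ...

-- Drop it again so development continues without the crutch
DROP INDEX sales_order.ix_sales_order_customer_id
GO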

 

Doing this gives you a certain level of assurance that you are writing optimized queries even without the benefit of indexes. You will appreciate this practice even more when there are many people on your team writing many queries. Once you do have the indexes, your queries will be lightning fast.

 

So how about you? When do you create your indexes?

It's been a while since I have posted something here. I was able to get some time to check out again some features of SQL Server which I have been eager to explore. I have been waiting for this ever since I experimented with it in version 2000.

Ever heard of thesaurus-based searching in SQL Server? This feature has been present since version 2000. However, it was an undocumented feature which I learned from Hilary Cotter. Google this guy; he has a lot of things to say about SQL Server, and he comes to mind every time I do fulltext stuff with SQL Server. He is the go-to guy when it comes to fulltext searching. I hope he still uses the bibliographic data I gave him a few years ago.

So what is this feature? For example, if you search for the word "color", records having words synonymous with "color", such as "hue" and "tint", can be returned as well. The feature also allows you to expand your search keys. For example, if you are searching for "100", it can return records containing the words "100", "one hundred" or "isang daan" (100 in Filipino).

How can this be done?

First, check out the files in \Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTData. The folder contains XML-based thesaurus definition files. SQL Server uses one of these files depending on the language setting you have. For English, check out tsENU.xml. In the file, you may add the following entries:

<replacement>
<pat>color</pat>
<sub>hue</sub>
</replacement>

<replacement>
<pat>hue</pat>
<sub>color</sub>
</replacement>

<expansion>
<sub>100</sub>
<sub>one hundred</sub>
<sub>isang daan</sub>
</expansion>


Then, enable your database for fulltext search. I'd leave this for you to explore; you just need a few mouse clicks to enable fulltext searching on your DB.
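
If you prefer T-SQL over mouse clicks, a minimal sketch on SQL Server 2005 would look something like this (the catalog name, the book table and its unique key index pk_book are assumptions for illustration):

-- Create a fulltext catalog and index the title column of the book table
CREATE FULLTEXT CATALOG book_catalog AS DEFAULT
GO

CREATE FULLTEXT INDEX ON book (title)
KEY INDEX pk_book
ON book_catalog
GO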

The following queries should give you an idea of how to use this feature. You may try and modify the scripts for use in your own applications.

SELECT title FROM book
WHERE CONTAINS(title, ' FORMSOF (THESAURUS, "hue") ')


This query will return records having the words "color" and "hue".

SELECT title FROM book
WHERE CONTAINS(title, ' FORMSOF (THESAURUS, "one hundred") ')


This query will return records having the words "100", "one hundred", or "isang daan".

SELECT title FROM book
WHERE CONTAINS(title, ' FORMSOF (THESAURUS, "isang daan") ')


This query will return records having the words "100", "one hundred", or "isang daan".

However, SQL Server's search is limited to what is defined in your XML-based thesaurus definition files. As far as I know, Microsoft has not provided a fully functional thesaurus definition file; MS left us only with templates (don't forget to remove the comments in the files). It is up to us to define the words we want included in searching. Yeah, yeah ... quite a lot of linguistic work for us techies. Let us hope MS is kind enough to include a fully functional thesaurus definition file next time.

That is all, folks. I hope this is enough to give you an idea and get you started implementing the feature in the apps that you develop.

A colleague of mine raised some intriguing points on why SQL Server's User Defined Datatypes (UDDs) can't be edited. You do the editing by dropping the datatype and recreating a new one. His position was that he avoided using UDDs simply because of this limited mechanism. This led me to think of another question.

The question really is: how often do we need to change the definition of our data? In its lifetime, I would say it should not be changed at all once our understanding of its nature is final. If the data definition we had changes from time to time, I would say there is a possible problem in our foresight or understanding of the nature of that data entity. If that is the case, it is not a good candidate for a UDD. Worse, if we are not sure of a data entity's definition, we have to get back to the drawing board and try to understand its nature again until we are indeed sure of its definition. This problem also reflects a lack of resolve when determining the true nature of certain data entities.

However, I do recognize that by the time we really need to change the definition of a data entity, the reason for doing so is probably very significant and justifiable, as it will have a great impact on both our database and our application. This issue brings us to the question of how clear the pictures in our minds are.

Here is my case, with a few examples to show how I deal with situations when understanding and defining data types.

Scenario #1: How stable can we define a data type?

BirthDate as DATETIME
EmailAddress as NVARCHAR(320)

Scenario #2: Questionable Candidates for UDD Creation

CustomerID as CHAR(5)

Scenario #3: The Lack of Understanding Example

Age as INTEGER – (assuming in Years)

Scenario #4: Things Change Over Time

PhoneNumber as CHAR(7)

The scenarios above are the usual things I encounter when someone defines the type of a certain piece of data.

In scenario #1, how often do we need to change these datatypes? If we have these kinds of data and a clear understanding of their very nature, defining UDDs for them would be sound. Thus, EXEC sp_addtype BirthDateType, datetime, 'NULL' would almost guarantee that we will NOT encounter problems in the future or be required to edit its definition.

Scenario #2 is a classic example of our lack of foresight or anticipation. A good DB designer could easily spot this. In the example, it is just a matter of time before we change it to CHAR(10) or CHAR(15), or better yet CHAR(25). I personally declare IDs as VARCHAR(255), as I don't have a habit of making ID types one of my system keys. I'd rather have IDs as VARCHAR(255) and index them with a UNIQUE constraint, making them candidates for user-consumable keys. As for me, I would not be inclined to create a UDD for this situation.

Scenario #3 is also a classic on how we fail to understand the very nature of certain data. Most of us, without batting an eyelash, almost automatically declare Age (in years) as INTEGER, only to find out later on that we need to store something like 10.25 for 10 years and 3 months.

Scenario #4 forces us to change. For decades, the Philippines has had 6- or 7-digit local phone numbers. We could probably have saved a few bytes and declared it as CHAR(7). Now we are seeing 8-digit phone numbers and we need to adapt.

If we had defined UDDs for all the samples above, scenarios #3 and #4 could force us to EDIT our UDDs. Using EXEC sp_addtype PhoneNumberType, CHAR(7), 'NULL', we have defined a UDD for #4. Dropping and recreating the UDD to support CHAR(8) could create a major headache in our app: we would need to revisit our SPs and character validation code, the formats and picture masks we defined, the CONSTRAINTs we declared, and other stuff on both the DB and the apps, and check everything related to PhoneNumberType. How do I avoid this? Should I need to support CHAR(8), I would definitely NOT do an "EDIT" by dropping and recreating my UDD. Instead, I'll create a new one with

EXEC sp_addtype PhoneNumberType2, CHAR(8), 'NULL'

and subsequently create new 'phone' type columns defined as PhoneNumberType2. Eventually, I will have to deprecate PhoneNumberType in favor of PhoneNumberType2, but that would be another story.
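
As a rough sketch of that migration path (the customer table and its column names are hypothetical):

-- Add a new column that uses the new UDD; the old column stays until it is deprecated
ALTER TABLE customer
ADD phone_number2 PhoneNumberType2 NULL
GO

-- Carry over the existing 7-digit values; they remain valid in a CHAR(8) column
UPDATE customer
SET phone_number2 = phone_number
GO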

This way we have zero impact on our apps and databases and have avoided possible headaches.