Software Design Case Study - Input & Output Formatter

The Problem

This was a sub-project in a larger application. The project had identified a collection of data as a "worklist", which contained information about sample tubes. Each worklist had a name and a number of items. Each worklist contained a number of racks, and each rack contained a number of tubes. The problem was that the physical representation of the worklist on disk had changed over time, yet the program had to be able to read any of the older formats. A newer format had also added error detection by inserting a cyclic redundancy check (CRC) at the end of the file. Regardless of the file format, the information itself was the same, so the application design called for a polymorphic Input and Output Formatter to perform the translation into and out of the physical file format.

The Design

During design, it became apparent that the printer was just another output format, so it was added as one. This allowed the same application code to be used to send the worklist to the printer. First, the information that changed (besides the worklist data itself) was assessed:
To handle the varying numbers and types of error conditions, a Generic Error Object was used. See the Error Object case study. Since worklist formatting errors were only one of a number of classes of errors, they had their own error object.

All files were ASCII text, so several file functions were common to all formats. Those functions that could or would be common were collected into a base class. There was no intention to ever instantiate an object of this base class, so the constructor was made protected.

Interface Definition

The interface was defined first, with all functions being stubbed for incremental implementation:
    class CFormat
        void LogError(ERRORTYPE dError, LPCSTR szErrorText);

This base class provided all the common functions, as well as controlled access to the error object. To provide a more convenient method of logging errors, the LogError() and LogFatalError() functions were added.

The logic flow of information was different for input and output operations, but was consistent for all input operations and consistent for all output operations. So the CFormat class was used to derive two other base classes: CInputFormat and COutputFormat. Their functions were pure virtual, allowing polymorphism for the specific input and output formatters. Again, because these classes were not to be instantiated, their constructors were made protected.
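The three-level hierarchy described above can be sketched roughly as follows. This is a minimal illustration and not the original source: the `ERRORTYPE` and `LPCSTR` typedefs are stand-ins, and `CErrorObject`, `Errors()`, `CDemoInput` and the `OpenWorklist(LPCSTR)` signature are hypothetical names chosen only so the sketch compiles on its own.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for types the article uses but does not define (assumptions).
typedef int ERRORTYPE;
typedef const char* LPCSTR;  // on Windows this normally comes from <windows.h>

// Hypothetical, minimal stand-in for the Generic Error Object.
struct CErrorObject {
    std::vector<std::string> messages;
    bool fatal = false;
};

class CFormat {
public:
    virtual ~CFormat() {}
    const CErrorObject& Errors() const { return m_errors; }

protected:
    CFormat() {}  // protected: "CFormat f;" in application code will not compile

    // Note the error; the caller may carry on.
    void LogError(ERRORTYPE dError, LPCSTR szErrorText) {
        assert(szErrorText != nullptr);
        m_errors.messages.push_back(std::to_string(dError) + ": " + szErrorText);
    }
    // Note the error and mark it fatal so the caller stops.
    void LogFatalError(ERRORTYPE dError, LPCSTR szErrorText) {
        LogError(dError, szErrorText);
        m_errors.fatal = true;
    }

private:
    CErrorObject m_errors;
};

// Abstract bases: pure virtual functions, protected constructors.
class CInputFormat : public CFormat {
public:
    virtual bool OpenWorklist(LPCSTR szName) = 0;
protected:
    CInputFormat() {}
};

class COutputFormat : public CFormat {
public:
    virtual bool OpenWorklist(LPCSTR szName) = 0;
protected:
    COutputFormat() {}
};

// Hypothetical concrete formatter, present only to exercise the base classes.
class CDemoInput : public CInputFormat {
public:
    bool OpenWorklist(LPCSTR szName) override {
        if (szName == nullptr || *szName == '\0') {
            LogFatalError(1, "missing worklist name");
            return false;  // a fatal error surfaces as a false return
        }
        return true;
    }
};
```

Only a derived class such as `CDemoInput` can be constructed; the error-logging helpers stay protected so that formatters, not the application, decide what gets logged.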
    class CInputFormat : public CFormat
        // Class builder to create the specific formatter.
    protected:

    class COutputFormat : public CFormat
        // Output Format only methods - replaces use of switch{}
        // Class builder to create the specific formatter
    protected:

The various Output Format Only methods represented the format-specific information that the main application had to know. In a previous application, switch{} statements had been used each time this information was needed. In this design, however, the output formatter would be asked to provide the information. For some formats the information was irrelevant (such as filenames when the output format is the printer). In those cases, a NULL value would be returned. This design allowed more formats to be added without impacting the rest of the code.

However, now that switch{} statements were no longer used, the default case in the switch{} had to be handled. This was done by providing a Null format. It was a formatter that did nothing - its functions were just stubs. It provided the same functionality as the "default" had provided in the previous switch{} statements.

Creating separate input and output format base classes accommodated the different logic flow between reading and writing. When writing, the application already knew the worklist size, so the logic was designed as follows:
    OpenWorklist(Worklist)

However, when reading, the application started with an empty worklist and had no way of knowing for certain how many racks and tubes there were until the entire worklist had been read. The logic was designed as follows:
    Worklist = OpenWorklist

The formatters themselves were responsible for logging their errors to the error object. The error object provided a LogError function for errors that only had to be noted while reading continued. It also provided a LogFatalError function that would cause the formatter's function to return a False or NULL value, which was used to break out of the FOR or WHILE loops. For brevity, the error logic is not shown in the pseudo-code above, but it followed my general Don't Try - DO precept.

To Instantiate a Formatter

The WLTYPE data type was added to define the logical worklist format. It was an enum and was passed to the formatter base class so that it could supply the matching formatter. I decided that the CInputFormat and COutputFormat base classes were all that the main application should ever know about. I did NOT want dependencies on the header files for the specific derived subclasses - that was information that the main application had no use for. I gave each base class a static function to get the specific formatter. The application would access the formatters as follows:

    CInputFormat* pInput = CInputFormat::GetInputFormatter( wlType );
    COutputFormat* pOutput = COutputFormat::GetOutputFormatter( wlType );

Obviously, the abstract base class cannot instantiate the subclasses itself - that would require the header files for each subclass, creating a circular reference. However, the application did not have to know HOW the formatter created the format object. In fact, using this design, the main application was never aware that the formatter was polymorphic at all - which is just as it should be. The effect is that the base class is asked to polymorph itself.

Object creation was delegated to a class factory - a separate object that CInputFormat and COutputFormat used to create the class. Since only one class factory should exist, it was created as a Singleton - using the static Instance() method to obtain a pointer to the class factory.
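To make the application's view concrete, here is a rough sketch of the read-side flow driven entirely through the abstract base class. Apart from `CInputFormat`, `GetInputFormatter`, `OpenWorklist` and `WLTYPE`, every name is hypothetical: the `NextRack`/`NextTube`/`HadFatalError` interface, the toy in-memory format, and the `WL_DEMO` enumerator are assumptions made for illustration, and the real factory is replaced here by a single function-local object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum WLTYPE { WL_DEMO };  // hypothetical enumerator; the real values are not shown

// ---- What the main application sees: the abstract base class only ----
class CInputFormat {
public:
    virtual ~CInputFormat() {}
    static CInputFormat* GetInputFormatter(WLTYPE wlType);

    // Hypothetical reading interface. Each call returns false when there is
    // nothing more to read OR when a fatal error has been logged.
    virtual bool OpenWorklist() = 0;
    virtual bool NextRack() = 0;
    virtual bool NextTube(std::string& id) = 0;
    virtual bool HadFatalError() const = 0;

protected:
    CInputFormat() {}
};

typedef std::vector<std::vector<std::string> > RackList;

// Application logic: sizes are unknown up front, so WHILE loops run until the
// formatter reports that it is done (or has failed).
bool ReadWorklist(CInputFormat& in, RackList& racks) {
    racks.clear();
    if (!in.OpenWorklist()) return false;
    while (in.NextRack()) {
        std::vector<std::string> tubes;
        std::string id;
        while (in.NextTube(id)) tubes.push_back(id);
        racks.push_back(tubes);
    }
    return !in.HadFatalError();
}

// ---- Implementation side: invisible to the main application ----
// Toy format: "R" starts a rack, "T <id>" is a tube, anything else is corrupt.
class CDemoInput : public CInputFormat {
public:
    explicit CDemoInput(const std::vector<std::string>& lines)
        : m_lines(lines), m_pos(0), m_fatal(false) {}

    bool OpenWorklist() override { m_pos = 0; m_fatal = false; return true; }
    bool NextRack() override {
        if (m_fatal || m_pos >= m_lines.size()) return false;
        if (m_lines[m_pos] != "R") return Fatal();
        ++m_pos;
        return true;
    }
    bool NextTube(std::string& id) override {
        if (m_fatal || m_pos >= m_lines.size()) return false;
        const std::string& line = m_lines[m_pos];
        if (line == "R") return false;                     // next rack begins
        if (line.compare(0, 2, "T ") != 0) return Fatal();
        id = line.substr(2);
        ++m_pos;
        return true;
    }
    bool HadFatalError() const override { return m_fatal; }

private:
    bool Fatal() { m_fatal = true; return false; }  // stands in for LogFatalError
    std::vector<std::string> m_lines;
    std::size_t m_pos;
    bool m_fatal;
};

CInputFormat* CInputFormat::GetInputFormatter(WLTYPE wlType) {
    assert(wlType == WL_DEMO);
    static CDemoInput demo(std::vector<std::string>{"R", "T a", "T b", "R", "T c"});
    return &demo;
}
```

`ReadWorklist` never names a concrete formatter, so adding a format would not touch it. A corrupt line makes the inner call return false, the outer loop then stops as well, and the function reports failure.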
In addition, each of the specific formatters would be Singletons, so the class factory held pointers to each of the specific objects; after all, there was really no reason for a formatter to be created more than once. Since the main application should have no dependency on the class factory, the header files for CInputFormat and COutputFormat contained no references to the factory. The factory was called by CInputFormat in this fashion:
    //-----------------------------------------------
    // By typecasting the subclass as the base class,
    //-----------------------------------------------
    protected:
    private:
        // Instances of the specific formatters (each one is a Singleton):

The Singleton

The book Design Patterns does a better job of explaining this pattern than I will, but basically the idea is to ensure that only a single instance of an object is ever created. That is done by delegating object creation to the object itself. Since the constructor is protected, the object cannot be instantiated by declaring a variable or by using "new" - a compiler error would result.
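A minimal sketch of the classic lazily created Singleton follows. It is a reconstruction under assumptions, not the original listing: `Instance()` and `m_hInstance` are names the article uses, while `CExampleFactory`, `NextId()` and `DemoSingleton()` are hypothetical. Like the pattern as described, this version makes no attempt at thread safety.

```cpp
#include <cassert>
#include <cstddef>

class CExampleFactory {
public:
    // The only way to obtain the object: it is created on first request.
    static CExampleFactory* Instance() {
        if (m_hInstance == NULL) {
            m_hInstance = new CExampleFactory;  // allowed: we are inside the class
        }
        return m_hInstance;
    }

    // Hypothetical member, here only to show that state is shared.
    int NextId() { return ++m_lastId; }

protected:
    CExampleFactory() : m_lastId(0) {}  // blocks outside construction

private:
    static CExampleFactory* m_hInstance;
    int m_lastId;
};

CExampleFactory* CExampleFactory::m_hInstance = NULL;

// Neither of these would compile in application code:
//   CExampleFactory f;
//   CExampleFactory* p = new CExampleFactory;

// Usage sketch: two requests yield the same object.
void DemoSingleton() {
    CExampleFactory* a = CExampleFactory::Instance();
    CExampleFactory* b = CExampleFactory::Instance();
    assert(a == b);
}
```

The instance is deliberately never deleted in this sketch; it lives for the duration of the program.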
NOTE: this Singleton code is not thread safe. This application was not multi-threaded, but if it were, it would have to be made thread safe by enclosing the Instance() code inside a critical section. Otherwise, a context switch in mid-if could result in duplicate objects being created, with one pointer ending up lost when the m_hInstance pointer created by the second thread is overwritten after the first thread regains control.

UML Design

The static structure for the output formats is as follows (CPhysicalAspect was the client object that handled the worklist data). The input formats have a similar structure: they are also derived from CFormat and also use CFormatBuilder as the class factory.

CFormatBuilder was the only code that relied on switch{WLTYPE wlType} to decide which object to instantiate, based on the enum WLTYPE. If that format was not applicable, then a NullFormatter was returned. For example, if asked to create the input formatter for the Printer - the NullFormatter would be returned.
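The factory side might look roughly like the sketch below. The class names `CFormat`, `CInputFormat`, `COutputFormat`, `CFormatBuilder` and the functions `GetInputFormatter`, `GetOutputFormatter` and `Instance` come from the article; the enumerators, the concrete formatter classes, `Name()`, `CreateInput` and `CreateOutput` are hypothetical. For brevity the builder holds its formatters as members and uses a function-local static rather than an `m_hInstance` pointer, which is a simplification of the design described.

```cpp
#include <cassert>
#include <string>

enum WLTYPE { WL_TEXT, WL_PRINTER, WL_UNKNOWN };  // hypothetical enumerators

class CFormat {
public:
    virtual ~CFormat() {}
    virtual std::string Name() const = 0;  // hypothetical, for the demo only
protected:
    CFormat() {}
};

class CInputFormat : public CFormat {
public:
    static CInputFormat* GetInputFormatter(WLTYPE wlType);
protected:
    CInputFormat() {}
};

class COutputFormat : public CFormat {
public:
    static COutputFormat* GetOutputFormatter(WLTYPE wlType);
protected:
    COutputFormat() {}
};

// ---- Known only to the factory's implementation file ----
struct CTextInput : CInputFormat      { std::string Name() const override { return "text-in"; } };
struct CNullInput : CInputFormat      { std::string Name() const override { return "null-in"; } };
struct CTextOutput : COutputFormat    { std::string Name() const override { return "text-out"; } };
struct CPrinterOutput : COutputFormat { std::string Name() const override { return "printer-out"; } };
struct CNullOutput : COutputFormat    { std::string Name() const override { return "null-out"; } };

class CFormatBuilder {
public:
    static CFormatBuilder* Instance() {
        static CFormatBuilder builder;  // created once, on first use
        return &builder;
    }
    // The only switch on WLTYPE in the whole program lives here.
    CInputFormat* CreateInput(WLTYPE wlType) {
        switch (wlType) {
        case WL_TEXT: return &m_textIn;
        default:      return &m_nullIn;  // e.g. WL_PRINTER: nothing to read
        }
    }
    COutputFormat* CreateOutput(WLTYPE wlType) {
        switch (wlType) {
        case WL_TEXT:    return &m_textOut;
        case WL_PRINTER: return &m_printerOut;
        default:         return &m_nullOut;
        }
    }
private:
    CFormatBuilder() {}
    CTextInput m_textIn;
    CNullInput m_nullIn;
    CTextOutput m_textOut;
    CPrinterOutput m_printerOut;
    CNullOutput m_nullOut;
};

// The base classes simply forward to the factory.
CInputFormat* CInputFormat::GetInputFormatter(WLTYPE wlType) {
    CInputFormat* p = CFormatBuilder::Instance()->CreateInput(wlType);
    assert(p != nullptr);  // the Null formatter means callers never see NULL
    return p;
}
COutputFormat* COutputFormat::GetOutputFormatter(WLTYPE wlType) {
    COutputFormat* p = CFormatBuilder::Instance()->CreateOutput(wlType);
    assert(p != nullptr);
    return p;
}
```

Because the `default` branch always yields a do-nothing formatter, application code can call through the returned pointer without checking it first, which is the role the old `default:` case used to play.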
The Implementation

The complete source code for the formatters is of little use, since the implementation of the various formats is of no value in this lesson. The source code for the CFormat test shell has a useful CRC calculation algorithm as well as an implementation of the Generic Error Object. This is the same source code as for Case Study #1. It allows the user to type some text and calculate the CRC. It also allows the user to set error flags, which are displayed when the CRC is calculated. The purpose of the shell was to perform integration of the CFormat object and the Generic Error object.

Unit Testing

Unit testing began by creating the Null formatter and verifying that the main application logic ran without crashing. Then the simplest output formatter was created and a worklist written in that format. One format at a time, first the output and then the input, the formatters were created and tested. The only impact that adding formatters had on the main application logic was to require adding a user interface method to select the new format.