
By PG Bartlett | October 29, 2003 12:00 AM EST
A successful XML publishing project inspired this article. The project's leader, who claims that the financial return gained for his company "made his career" there, achieved success for two reasons: he focused on the right goals and executed the project in the right way.
This article focuses on two things: how to establish the right goals for an XML-based publishing project and the most common mistakes made. We explore the topic by discussing how to go about it the wrong way.
Mistake #1: Plan too little
Everyone knows the importance of upfront planning, right? Yet, even though "everyone knows," we regularly see projects marred by inadequate and superficial planning.
Why does this happen? Two common reasons emerge. First, most people responsible for planning grew up with word-processing and desktop-publishing software. As a result, they typically think that implementing an XML-based system primarily involves a substitution of technologies and file formats.
In reality, using XML for publishing involves new and unfamiliar concepts - it's a true paradigm shift. Unless someone with XML publishing experience helps with the planning, you will likely invest too little in the upfront work.
Second, the decision to launch an XML publishing project can take too long (doesn't it always?). But because the deadline doesn't change, planning gets squeezed to leave more time for implementing the wrong thing. Dilbert cartoons routinely illustrate this problem quite effectively.
Complicating this problem, it's also possible to go overboard on planning. This occurs much less often, but it's still costly because it delays the realization of benefits. Six to eight weeks for planning is about right. If that's not sufficient, then you're probably making mistake #2.
Mistake #2: Try to do too much at once
Once bitten by the XML publishing bug, it's easy to identify opportunities for dramatic improvement everywhere in your organization. So much waste! So much redundancy! So much inaccuracy! How could we have been so blind?
But you must resist trying to change everything at once. Too many people, too many processes, and too many document types exist to tackle everything at once. Instead, start with one group, one process, and one set of related document types.
Some words of caution: make sure you take the long view when planning so that phase VII of your project works well with phase I. You don't want every phase to require going back and changing previously completed phases.
Mistake #3: Try to change too little
Here's a surefire way to fail: start with the aim of creating "minimum disruption." Sounds good - won't work. You want to leave the same tools and processes in place and get a different result? You don't want to affect anyone or change anything but you want to achieve great benefits?
No magic beans exist. If you want to achieve dramatic results, expect to make dramatic changes. Since people naturally resist change, you will need to sell them on the organizational and individual benefits of the changes.
Mistake #4: Try to automatically convert all existing content to XML
Here's one of the most dangerous misunderstandings in publishing: the belief that existing processes and tools produce information that is sufficiently consistent to allow automatic conversion to XML. No matter how many times we have encountered that belief - and no matter how insistently it is expressed - it is always wrong.
Word-processing and desktop-publishing tools survive precisely because of the flexibility and freedom they provide to authors. These product attributes are diametrically opposed to the primary purpose of creating XML content, which involves constraining the author to create content according to a set of rules.
Is it hopeless to convert existing content to XML? Not at all. Tools are available that can convert existing content to XML. But you must accept that manual cleanup will be required, so design your process accordingly.
If you're contemplating a one-time conversion of existing information to XML, that's a subject for another article. In this article, we're focusing on building a new system that uses ongoing conversions from word processors.
In such cases, for simple documents or simple content, the manual cleanup may be minimal and, therefore, reasonable. But for long, complex documents, the cleanup cost may be excessive.
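One way to design the process around manual cleanup is to have the converter set aside whatever it cannot map confidently, instead of guessing. The following is a minimal sketch, not a description of any particular conversion tool: the style names, the `STYLE_TO_ELEMENT` mapping, and the `convert` function are all hypothetical, and the input is assumed to be a list of (style, text) pairs already extracted from a word-processor file.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor style names to XML element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "paragraph",
}


def convert(paragraphs):
    """Convert (style, text) pairs to XML and collect what needs manual cleanup."""
    root = ET.Element("document")
    needs_cleanup = []
    for index, (style, text) in enumerate(paragraphs):
        element_name = STYLE_TO_ELEMENT.get(style)
        if element_name is None:
            # Unmapped style: do not guess, queue the paragraph for a person to review.
            needs_cleanup.append((index, style, text))
            continue
        ET.SubElement(root, element_name).text = text
    return root, needs_cleanup


# Illustrative input only; the third paragraph uses ad hoc formatting as a heading.
sample = [
    ("Heading 1", "Installation"),
    ("Body Text", "Remove the cover."),
    ("Normal + Bold 14pt", "Safety"),
]

root, needs_cleanup = convert(sample)
print(ET.tostring(root, encoding="unicode"))
print(len(needs_cleanup), "paragraph(s) need manual cleanup")
```

The size of the cleanup queue per document gives a rough measure of whether ongoing conversion is reasonable for that kind of content.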
You should carefully avoid presenting a cost justification for your system that depends on ongoing, fully automatic conversion of long, complex information to XML.
Mistake #5: Try to convert word-processing tools to XML editors
We have seen companies waste millions of dollars building applications on top of word processors in an attempt to force authors to conform consistently to a set of rules. Why? Because the tools do not provide the architecture that absolute conformance to a data model requires.
Fortunately, word processors and desktop-publishing software are becoming increasingly XML-aware and a few are even XML-capable. These tools offer a greater chance of success, especially if you arm yourself with expert assistance to dissect vendors' claims.
We'll explore this topic in greater detail in a future article.
Mistake #6: Set up too many rules
We're referring to the data model - the DTD or schema - that guides the author in creating and editing content. Two dimensions exist to the problems of "too many rules." First, the data model is too restrictive, and second, the data model has too many tags.
Many novices begin by designing highly restrictive data models with lots of tags. Such data models involve too many subsequent changes, which cost time and money, and require authors to spend a long time learning them.
To make a model overly restrictive, you would be very careful about limiting where tags can be used and how they can be used. For example, you may decide that a <partnumber> tag can appear only in a <paragraph> tag. But later you may realize that you have to allow a <title> tag to contain <partnumber> as well. And then you'll find still more places where you need to be able to use <partnumber>.
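The part-number example can be sketched as a toy content model. This is only an illustration under stated assumptions: the model is a plain dictionary, not a real DTD or schema; the `violations` function and the nested-tuple document are hypothetical; and the element is spelled `partnumber` because XML element names cannot contain spaces.

```python
# Hypothetical content model: each element lists the child elements it may contain.
restrictive_model = {
    "section": {"title", "paragraph"},
    "title": set(),                 # titles may not contain a part number
    "paragraph": {"partnumber"},    # part numbers allowed only here
    "partnumber": set(),
}


def violations(model, tree):
    """Return (parent, child) pairs that the model does not allow.

    `tree` is a nested (name, [children]) tuple standing in for a parsed document.
    """
    name, children = tree
    found = []
    for child in children:
        child_name = child[0]
        if child_name not in model.get(name, set()):
            found.append((name, child_name))
        found.extend(violations(model, child))
    return found


# An author puts a part number in a title as well as in a paragraph.
doc = ("section", [
    ("title", [("partnumber", [])]),
    ("paragraph", [("partnumber", [])]),
])

print(violations(restrictive_model, doc))  # [('title', 'partnumber')]

# The fix is a change to the data model itself, repeated for each new place found.
relaxed_model = dict(restrictive_model, title={"partnumber"})
print(violations(relaxed_model, doc))  # []
```

Each place discovered later means another revision of the model, which is where the time and money go.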
To create a problem of too many tags, give authors somewhere between 200 and 300 tags to learn so that they reach their maximum productivity just about the time that they move on to another job. If you want an overly broad generalization, shoot for 30 tags.
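A quick way to see where a data model stands against that rough figure is to count its element declarations. The sketch below assumes a single self-contained DTD held in a string; the DTD fragment, the `declared_elements` function, and the `TAG_BUDGET` name are hypothetical, and the regular expression is a simplification that does not resolve parameter entities or skip declarations inside comments.

```python
import re

# Hypothetical DTD fragment used only for illustration.
dtd_text = """
<!ELEMENT section (title, paragraph+)>
<!ELEMENT title (#PCDATA | partnumber)*>
<!ELEMENT paragraph (#PCDATA | partnumber)*>
<!ELEMENT partnumber (#PCDATA)>
"""

# Captures the element name that follows each <!ELEMENT keyword.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)")

TAG_BUDGET = 30  # the rough guideline suggested above


def declared_elements(dtd):
    """Return the sorted, de-duplicated element names declared in a DTD string."""
    return sorted(set(ELEMENT_DECL.findall(dtd)))


names = declared_elements(dtd_text)
print(len(names), "elements declared:", names)
if len(names) > TAG_BUDGET:
    print("More tags than the rough guideline; consider consolidating.")
```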
Mistake #7: Use too many moving parts
The problem with too many moving parts is that you must do a lot of work to choose them, integrate them, test them, and keep them all working.
In traditional publishing processes involving a lot of manual work, this usually doesn't cause a problem. Many moving parts may exist, but human intervention integrates them and keeps the whole machine working. For example, contributing authors may use word processors while the technical publications department uses desktop-publishing software and manually imports the word-processor files as needed.
In an XML publishing system, however, one of the goals is to eliminate human intervention and make everything work together automatically. Fulfilling this goal requires tight integration among the various software products.
XML publishing systems must also deliver more functionality and productivity than the traditional systems they replace, so a key project requirement usually includes the implementation of a content management system as well.
No single vendor offers a complete system that delivers all of the functionality needed in support of every type of content. That leaves customers with the task of selecting vendors for each piece of functionality needed.
The short answer is to limit the number of vendors involved - choose enough to accomplish your goals (both immediate and future!) but no more. The long answer is to get some expert assistance to help you match your current and future needs with the products available.
Copyright © 2003 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By PG Bartlett
PG Bartlett is vice president of product marketing at Arbortext, where he is responsible for corporate positioning, marketing strategy, and product direction. Bartlett joined Arbortext in 1994, bringing more than 18 years of experience in both technical and marketing positions at leading-edge high technology companies. He is a frequent presenter at major industry events and has been invited to speak and chair sessions at Comdex, Seybold Seminars, XML conferences, AIIM conferences, and others.
James Fuller 11/19/03 06:51:24 PM EST
A few ruminations on your article: Planning instead of building is an age-old concept in software (as well as buildings), which with every passing month seems to be reiterated in one passing methodology fad or another. Most of the points you raise are generally applicable to 'all things' software. I would respectfully point out that there are a few other, possibly more important issues when designing with XML. I will list some further alternate ways of messing up with XML:
- not recognizing the differences in relational vs hierarchical data; for 20+ years RDBMS have been king....
- not identifying document-centric vs data-centric data in one's usage of XML
- XML should be human readable, the moment it becomes opaque to human inspection....the moment it becomes hard to debug/read/see if it's correctly doing its job
- don't be afraid to cook your own XML vocabulary, but always look around to see if someone else has done it before you. We see too many people replicating effort, where enhancing an existing XML vocabulary is much less effort
- just because you like XML, don't force a declarative processing model on all your publishing processes, sometimes it's easier to just pass a filter through all of your data using classic parser techniques; hybrid approaches tend to be more successful than 'golden hammer'
- don't force XML on domain experts, if they are comfortable with existing methods, then just take their output and xml'ify it at the end of the publishing workflow
- recognize that the biggest impact of XML is Unicode, ubiquitous usage, and the sheer utility of an easily understandable short-term data format
- early taxonomisation of XML is a pitfall, there is little need to initially absolutely define a vocabulary with all the expressive power of XML Schema.
- Publishing can reflect pipelines of processing, take a look at existing XML application servers...I see many people replicating functionality where Cocoon, AxKit, or Ant may be appropriate.
and lastly use xml:lang. regards,