Implementing Teradata Utilities in A Powercenter Mapping
Implementing Teradata Utilities in A Powercenter Mapping
PowerCenter Mapping
© 2013 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means
(electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. All other company and
product names may be trade names or trademarks of their respective owners and/or copyrighted materials of such
owners.
Abstract
Teradata provides utilities, like FastLoad, FastExport, TPump, and MultiLoad, that enable you to
quickly load data into a Teradata database or export data from a Teradata database to a client
application. This article describes some Teradata utilities and provides an example of how to use
them in a PowerCenter mapping.
Supported Versions
PowerCenter 8.x - 9.5.x
Table of Contents
Abstract ............................................................................................................................................ 2
Supported Versions ......................................................................................................................... 2
Table of Contents ............................................................................................................................ 2
Overview .......................................................................................................................................... 2
Teradata Utilities .............................................................................................................................. 3
FastLoad ...................................................................................................................................... 3
FastExport .................................................................................................................................... 4
TPump.......................................................................................................................................... 4
MultiLoad ...................................................................................................................................... 4
Example: Using TPump and FastLoad in a PowerCenter Mapping ................................................ 4
Script that Determines the Loader Type ...................................................................................... 5
Session Properties ....................................................................................................................... 6
Step 1. Specify the Path and Name of the Parameter File ...................................................... 6
Step 2. Specify the Directory and Name of the Source Flat File .............................................. 6
Step 3. Set the Loader Connection Properties......................................................................... 7
Data Load and Usage of the Loader Log File .............................................................................. 7
Sample Loader Log File ........................................................................................................... 7
Sample Error Table .................................................................................................................. 8
Conclusion ....................................................................................................................................... 8
LoaderConnection_Script.txt ........................................................................................................... 8
Presession_Cmd_Script.txt ........................................................................................................... 11
Overview
Implementing Teradata utilities, like TPump and FastLoad, in some ETL tools can be challenging.
This document describes how to implement Teradata utilities in a PowerCenter mapping.
Informatica supports various enterprise-wide data integration and data quality solutions, including
data warehousing, data migration, data consolidation, data synchronization, data governance,
master data management, and cross-enterprise data integration.
Teradata is a parallel processing system that runs on a share-nothing architecture. The Teradata
database management system scales linearly and predictably based on all dimensions of a
2
database system workload, such as data volume, breadth, number of users, and complexity of
queries.
Teradata acts as a single data store that can accept a large number of concurrent requests from
multiple client applications.
Teradata includes the following key features:
Unconditional parallelism where several servers share the load distribution.
Complex ad hoc queries with up to 64 joins.
Parallel efficiency such that the effort for creating 100 records is same as that for creating
100,000 records.
Scalability where performance scales linearly when you increase the number of processors.
No performance degradation when you increase the number of users.
Teradata Utilities
Teradata offers utilities that support the management and maintenance of a data warehouse on a
Teradata database.
Some Teradata utilities include:
FastLoad
FastExport
TPump
MultiLoad
FastLoad
FastLoad is a parallel-load utility that is designed to move large volumes of data from data
sources on channel-attached and network-attached clients to empty tables in a Teradata
database.
FastLoad typically provides better performance than a standard application program that loads
data to an empty database because data allocation, data conversion, data movement, and data
loading run automatically and in parallel.
Features:
Performs the initial table load.
Supports delimited or fixed-width flat file input.
Processes large data volumes.
Constraints:
The target table must be empty before the load.
No read or write access is allowed on the target table while the load occurs.
Each job can load only one table.
3
FastExport
FastExport is a command-driven utility that uses multiple sessions to quickly transfer large
amounts of data from Teradata tables and views to a client application. You can export data from
any table or view on which you have the SELECT access privilege.
You can export the data to the following destinations:
A file on your channel-attached or network-attached client system
An Output Modification (OUTMOD) routine that you wrote to select, validate, and preprocess
the exported data
TPump
TPump is a highly parallel utility designed to continuously move smaller data volumes from data
sources into Teradata tables without locking the affected tables. TPump enables you to load near
real-time data into the data warehouse, which enables you to make decisions based on current,
accurate data. You can use TPump to insert, update, upsert, and delete data in a Teradata
database.
Features:
Performs fast, scalable, continuous data loads.
Can load data into the following types of target tables:
- Targets that have secondary indexes, referential integrity, and constraints.
- Targets that are empty or that contain data.
MultiLoad
MultiLoad is a command-driven parallel load utility for high-volume batch maintenance on multiple
tables and views in a Teradata database. It is specially designed for high-speed batch creation
and maintenance of large databases. It supports up to five populated tables and performs block-
level operations on populated tables.
4
The following figure shows the PowerCenter mapping that contains the flat file source and the
Teradata target:
Parameter Description
5
$PMRootDir/BWParam/svc/ISP_NATIVE/wf_SVC_ISP_NATIVE_LRF_INC.parm is the directory
and file path of the parameter file. The parameter file dynamically stores the loader connection for
the s_m_ISP_NATIVE_S_ASSET_LRF_INC session based on the cr_TD_conn_type.ksh loader
connection script. The session s_m_ISP_NATIVE_S_ASSET_LRF_INC calls the parameter file to
get the loader connection using the loader connection variable
$LoaderConnection_s_m_ISP_NATIVE_S_ASSET_LRF_INC.
TPUMP_SERVICE_ISP_ETL_01 and FLOAD_SERVICE_ISP_ETL_01 are the TPump and Fload
connections, respectively. The cr_TD_conn_type.ksh script sets the loader connection for the
s_m_ISP_NATIVE_S_ASSET_LRF_INC session in the
$LoaderConnection_s_m_ISP_NATIVE_S_ASSET_LRF_INC variable of the
wf_SVC_ISP_NATIVE_LRF_INC.parm parameter file.
$PMRootDir/TgtFiles/svc/ISP_NATIVE/lrf_S_ASSET.dat is the directory and file path of the flat
file.
Session Properties
Perform the following steps to set the session properties.
Step 2. Specify the Directory and Name of the Source Flat File
Specify the directory and name of the source flat file in the source attribute properties of the
Mapping tab.
6
Step 3. Set the Loader Connection Properties
In the target connection properties, set the type to ‘Loader’ and value to
‘LoaderConnection_{Session name}.’ The loader connection value is fetched from the parameter
file. The parameter file stores the connection value dynamically using a shell script.
7
The name of the error table is ERROR_T.ET_S_ASSET0.
Conclusion
This document described some Teradata utilities, described how to implement the TPump and
FastLoad Teradata load utilities in a PowerCenter mapping, and described how to use Loader log
file and error tables.
LoaderConnection_Script.txt
tdate=`date '+%Y-%m-%d %T'`
LOAD_LIMIT=50000
LOG_FILE=$PMRootDir/TgtFiles/svc/ISP_NATIVE/conn_type.log$$
if [ $# -ne 5 ]
then
echo "Usage: cr_TD_conn_type.ksh <SESSION_NAME> <PARAM_FILE_WITH_PATH>
<LRF_NAME_WITH_PATH> <TPUMP_CONNECTION> <FLOAD_CONNECTION>"
8
echo "Give Complete Path for PARAM_FILE and Lrf"
echo "Example: \$PMRootDir/ScriptFiles/dataload/cr_TD_conn_type.ksh
s_m_LRF_to_TD_fcst_plan
\$PMRootDir/BWParam/ebi20/wf_src_HPT_tgt_AUDIT_set_01.parm
\$PMRootDir/TgtFiles/ebi20/fin/tablename.dat TPUMP_HPT_FIN_01 FLOAD_HPT_FIN_01"
SESSION_NAME=$1
PARAM_FILE=$2
LRF_NAME=$3
TPUMP_CONNECTION=$4
FLOAD_CONNECTION=$5
LOADER_CONNECTION=LoaderConnection_${SESSION_NAME}
LOG_FILE=$PMRootDir/TgtFiles/svc/ISP_NATIVE/conn_type_${SESSION_NAME}.log
echo > ${LOG_FILE}
if [ -f "${PARAM_FILE}" ]; then
if [ -f "${LRF_NAME}" ]; then
FILE_COUNT=$(echo `wc -l ${LRF_NAME} | awk -F' ' '{ print $1
}'`)
else
echo "Missing LRF File : ${LRF_NAME}. Please check" >>
${LOG_FILE}
exit 1
fi
else
echo "Missing Paramter File : ${PARAM_FILE}. Please check" >>
${LOG_FILE}
exit 1
fi
echo "Count for ${LRF_NAME} = ${FILE_COUNT}" >> ${LOG_FILE}
9
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "### place holder characters to avoid the occasional error in position
1024 ........................................................." >>
${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
echo "###
...............................................................................
................................................." >> ${PARAM_FILE}
fi
10
echo "Loader connection ${LOADER_CONNECTION} is not present. Creating
new connection type" >> ${LOG_FILE}
echo "" >> ${PARAM_FILE}
echo "[${SESSION_NAME}]" >> ${PARAM_FILE}
echo "\$${LOADER_CONNECTION}=${CONNECTION_TYPE}" >> ${PARAM_FILE}
else
echo "Loader connection ${LOADER_CONNECTION} already present. Replacing
connection type" >> ${LOG_FILE}
echo "sed
s/${LOADER_CONNECTION}=${PARM_TO_REPLACE}/${LOADER_CONNECTION}=${CONNECTION_TYP
E}/" >> ${LOG_FILE}
sed
s/${LOADER_CONNECTION}=${PARM_TO_REPLACE}/${LOADER_CONNECTION}=${CONNECTION_TYP
E}/ < ${PARAM_FILE} > ${PARAM_FILE}$$
mv ${PARAM_FILE}$$ ${PARAM_FILE}
fi
Presession_Cmd_Script.txt
$PMRootDir/ScriptFiles/dataload/cr_TD_conn_type.ksh
s_m_ISP_NATIVE_S_ASSET_LRF_INC
$PMRootDir/BWParam/svc/ISP_NATIVE/wf_SVC_ISP_NATIVE_LRF_INC.parm
$PMRootDir/TgtFiles/svc/ISP_NATIVE/lrf_S_ASSET.dat TPUMP_SERVICE_ISP_ETL_01
FLOAD_SERVICE_ISP_ETL_01
Authors
Anil Praneeth
Professional Services Consultant
11