An Approach For Logo Detection and Retrieval
in Documents
1 Introduction
Logos are the brand ambassadors of every organization, whether it is a
business or a government body. Organizations promote their ideas or products
through their logos and invest millions of rupees in them. Government and
business documents generally carry a logo, which plays an important role in
indicating the source of the document and in identifying the issuing
organization. Logo detection and recognition has become an active topic in
Document Image Analysis and Recognition (DIAR) and in pattern recognition.
Automatic logo recognition relies on computer vision methodologies and pattern
recognition techniques, and computer-aided techniques make the task much
easier. Logos in a document convey a lot of information, such as the
organization to which the document belongs. Logos come in different shapes,
forms and dimensions, and vary in composition: some consist only of text, some
only of graphics, and some combine text and graphics. These features help to
differentiate a logo from the other content in the body of the document, and
they contribute to logo detection, retrieval and matching in documents.
© Springer Nature Singapore Pte Ltd. 2017
K.C. Santosh et al. (Eds.): RTIP2R 2016, CCIS 709, pp. 49–58, 2017.
DOI: 10.1007/978-981-10-4859-3_5
Y.H. Sharath Kumar and K.C. Ranjith
2 Literature Survey
Here we review the papers related to logo detection and retrieval. Viet
et al. [1] presented a methodology for digital document categorization based
on logo spotting. The logos are first segmented using spatial density-based
clustering and then recognized using key-point matching. Stefan et al. [2]
proposed a highly effective and scalable framework for recognizing logos in
images. Alireza et al. [3] proposed a coarse-to-fine logo detection scheme for
document images, in which the content of a document image is pruned using a
decision tree and a Nearest Neighbour (NN) classifier is used for
classification. Divya and Padmalatha [4] proposed a technique that can
recognize different instances of logos; the logo classification depends on the
description of a Context Dependent Similarity (CDS) kernel. Baiying et al. [5]
worked on the classification of merchandise logos using a combination of a
local edge-based descriptor, a spatial histogram and salient region detection.
Guangyu et al. [6] presented an automatic retrieval system for document images
in which SIFT features with a k-d tree indexing algorithm are used for
efficient logo retrieval. Rajiv et al. [7] developed a retrieval system for
logos in document images; SURF features with an indexing algorithm are used
for efficient logo retrieval. Guangyu et al. [8] proposed an automatic
technique to identify logos in documents. The logos are first segmented using
a boosting algorithm, SIFT features are extracted from the detected logos, and
a k-d tree indexing algorithm is used for efficient logo retrieval. Shridevi
and Dhandra [9] developed a model for automatic detection and recognition of
logos. Marcal et al. [10] proposed a method that classifies documents such as
receipts or bills using bag-of-words features. Hongye et al. [11] proposed an
algorithm for logo detection based on the boundaries of the logos; the
detected logos are classified using a decision tree classifier. Marcal
et al. [12] proposed two models for a logo detection and classification
system, the first based on bag-of-visual-words and the second on a
sliding-window technique; the extracted features are fed into a support vector
machine (SVM) classifier. Marcal et al. [13] designed an approach for grouping
and indexing digital logo libraries similar to those of trademark and patent
offices. A query-by-example retrieval system is proposed that fetches logos
from a dataset based on image similarity. Kuo-Wei et al. [14] presented a logo
classification system that extracts features such as histograms of oriented
gradients (HOGE) and the scale-invariant feature transform (ASIFT); the
extracted features are fed into an SVM classifier. Xiaobing et al. [15]
developed a method that collects representative logo images from the internet
automatically, without human labeling or seed images. Nishanth et al. [16]
designed a technique for classifying logos based on the Context-Dependent
Similarity (CDS) kernel. Souvik [17] studied shape-based feature values of
logo images. Mohammadreza et al. [18] discussed the various
types of algorithms for document image retrieval and their results, and
proposed a framework for classifying document image retrieval approaches.
Stefan et al. [19] proposed a system for identifying logos in images using
local features and a spatial structure composed of triangles and edges. David
et al. [20] proposed a method for recognizing logos using a multi-level staged
approach that combines global and local fine invariants. Aya et al. [21]
described an approach for logo representation based on positive and negative
shape features.
The paper is organized as follows. Section 3 explains the proposed method with
a block diagram, along with a brief introduction to logo detection and central
moments. The feature extraction method is described in Sect. 3.3, and
Sect. 3.4 briefly describes feature reduction using PCA. The experimental
results are discussed in Sect. 5, and the paper is concluded in Sect. 6.
3 Proposed Work
The proposed method comprises logo detection, feature extraction,
dimensionality reduction, and experimentation for both detection and
retrieval. The following subsections briefly describe each step.
Detection of logos in a given document proceeds in three stages. In the first
stage, the outer boundary and the background are eliminated. The second stage
detects lines in the document. In the final stage, the logos are detected
using central moment variations. After logo extraction, the detection result
is evaluated: we asked five human experts to identify the logos in the
documents by drawing rectangular boxes, and we then matched the coordinates of
the logos identified by the proposed method against those marked by the
experts. The block diagram in Fig. 1 shows the computational process involved
in the proposed methodology.
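The coordinate matching described above can be illustrated with a small sketch. The text does not state how the match between a detected box and an expert's box is scored, so the intersection-over-union measure used here, the `(x_min, y_min, x_max, y_max)` box layout, and the function names are all assumptions made for illustration.

```python
def box_overlap_score(detected, expert):
    """Intersection-over-union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). IoU is an assumed stand-in
    for the matching score; the paper does not specify its formula.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(detected[0], expert[0])
    iy_min = max(detected[1], expert[1])
    ix_max = min(detected[2], expert[2])
    iy_max = min(detected[3], expert[3])

    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_detected = (detected[2] - detected[0]) * (detected[3] - detected[1])
    area_expert = (expert[2] - expert[0]) * (expert[3] - expert[1])
    union = area_detected + area_expert - intersection
    return intersection / union if union > 0 else 0.0


def mean_score_against_experts(detected, expert_boxes):
    """Average the overlap score of one detection over several expert boxes."""
    scores = [box_overlap_score(detected, box) for box in expert_boxes]
    return sum(scores) / len(scores)
```

With five expert annotations per document, `mean_score_against_experts` would give one summary value per detection; per-expert scores can be kept separately by calling `box_overlap_score` directly.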
Fig. 2. (a) A document with background art and a logo. (b) The document image
after eliminating smaller components.
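The clean-up shown in Fig. 2(b) can be sketched as a connected-component size filter. This is an illustrative sketch, not the authors' implementation: the use of `scipy.ndimage.label` with its default connectivity, the binary foreground mask as input, and the `min_pixels` cut-off are all assumptions, since the text does not give these details.

```python
import numpy as np
from scipy import ndimage


def remove_small_components(binary_image, min_pixels):
    """Keep only connected foreground components with at least min_pixels pixels.

    binary_image: 2-D array, non-zero where the document has foreground.
    min_pixels:   hypothetical size cut-off, not a value from the paper.
    Returns a boolean mask of the retained foreground.
    """
    mask = np.asarray(binary_image, dtype=bool)
    # Label each connected foreground region with an integer id (0 = background).
    labels, count = ndimage.label(mask)
    # Pixel count of every label, including the background at index 0.
    sizes = np.bincount(labels.ravel(), minlength=count + 1)
    keep = sizes >= min_pixels
    keep[0] = False  # never keep the background
    return keep[labels]
```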
In the central moment of order (p, q), x̄ and ȳ are the centroid coordinates of
the image of size m × n, p and q are the orders of the moment in the x and y
directions, respectively, and f(x, y) is the intensity value at coordinates
(x, y). Logo regions show little variation in their moment values, and these
moment variations in the document are identified using central moments.
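For reference, the symbol definitions above match the standard two-dimensional central moment. The form below is a reconstruction under the assumption that the definition follows the textbook form; pairing x with the n columns and y with the m rows is likewise assumed, chosen so that the sum over x runs from 1 to n as in Eq. (2).

```latex
\mu_{pq} = \sum_{y=1}^{m} \sum_{x=1}^{n} (x - \bar{x})^{p} \, (y - \bar{y})^{q} \, f(x, y)
```

Fixing a single row y and dropping the factor in y leaves a one-dimensional sum over x, which is the per-row quantity used for logo detection.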
$$\mu_{30} = \sum_{x=1}^{n} (x - \bar{x})^{p} f(x, y) \tag{2}$$
For each component of an image, h values of $\mu_{30}$ are obtained from
Eq. (2), where h is the height of the corresponding component. Similar,
repeated high central moment values indicate the presence of logos in the
document. If $\mu_{30}^{k}$ is the central moment value for the kth component,
then $\mu_{30}^{k}$ > threshold indicates a logo in the kth component. Whether
the central moment values of a component count as high is decided using a
threshold value. The threshold for a component is obtained by assuming an
imaginary line for that component and computing the central moment of this
line. The threshold for the component is fixed as 50 of the central moment of
the imaginary line in that component. The components containing logos are
identified and labeled as shown in Fig. 3.
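The per-row computation of Eq. (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the component is a 2-D intensity array with one row per image line, the exponent is taken as 3 to match the subscript of the moment, and the centroid is computed over the whole component. The helper `rows_above_threshold` and its `reference` and `fraction` parameters are hypothetical. The text does not define the imaginary line precisely, so its moment is passed in by the caller, and comparing magnitudes is also an assumption.

```python
import numpy as np


def row_central_moments(component, order=3):
    """Return one central-moment value per row of a 2-D intensity array.

    For each row y this evaluates sum_x (x - x_bar)**order * f(x, y),
    with x running from 1 to the component width.
    """
    comp = np.asarray(component, dtype=float)
    n = comp.shape[1]
    x = np.arange(1, n + 1, dtype=float)
    total = comp.sum()
    # Intensity-weighted centroid along x; geometric centre if the component is empty.
    x_bar = float(comp.sum(axis=0) @ x) / total if total > 0 else (n + 1) / 2.0
    return ((x - x_bar) ** order * comp).sum(axis=1)


def rows_above_threshold(moments, reference, fraction):
    """Flag rows whose moment magnitude exceeds fraction * |reference|.

    reference: moment of the imaginary line (supplied by the caller).
    fraction:  assumed scaling of that reference; not a value from the paper.
    """
    return np.abs(moments) > fraction * abs(reference)
```

A component would then be a logo candidate when many of its h rows are flagged with similar values, which is how the text describes repeated high moments.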
Fig. 3. An input image and the components containing logos, identified using
central moments.
4 Dataset
In this work we created our own database, although other databases exist. The
dataset consists of 500 real document images. The samples include conference
certificates, attendance certificates, degree certificates, transfer
certificates, etc. The copies were scanned using an HP flatbed scanner to
produce bitmap images at 300 dpi. Figure 4 shows some samples of the logos
collected, and Fig. 5 shows samples of the document images.
5 Experimentation
Fig. 6. Sample documents with the minimum bounding rectangle fixed by the
proposed logo detection method.
Fig. 7. Overall matching score of the proposed logo detection method over all
500 document images against each expert.
Fig. 8. Examples of document images with the logo regions detected by the
proposed method and the corresponding ground truth marked by the five experts
HE1, HE2, HE3, HE4 and HE5, with the calculated matching scores.
6 Conclusion
In this paper, we have proposed a novel method to identify logos using central
moments. We conducted experiments on our own dataset. To corroborate the
efficiency of the proposed method, we created ground truth in which five human
experts identified the logos by manually drawing rectangular bounding boxes.
We then matched the bounding boxes produced by the proposed method against
those of the human experts to analyze the error. For the detected logos, SIFT
features are extracted and reduced using PCA, and a k-d tree is used for logo
retrieval.
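The retrieval step summarized above (feature vectors reduced with PCA, then indexed with a k-d tree) can be sketched as follows. This is a sketch under stated assumptions rather than the authors' pipeline: random vectors stand in for 128-dimensional SIFT descriptors, each logo is represented by a single vector, 16 components are retained, and SciPy's `cKDTree` serves as the index. None of these choices come from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree


def fit_pca(features, n_components):
    """Return the mean and the top principal axes of a (samples, dims) array."""
    data = np.asarray(features, dtype=float)
    mean = data.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, axes):
    """Project feature vectors onto the retained principal axes."""
    return (np.asarray(features, dtype=float) - mean) @ axes.T


# Synthetic stand-ins for descriptors; real SIFT extraction is out of scope here.
rng = np.random.default_rng(0)
database = rng.normal(size=(200, 128))

mean, axes = fit_pca(database, n_components=16)
reduced = project(database, mean, axes)
index = cKDTree(reduced)  # k-d tree over the reduced vectors

# Query with a vector already in the database; it should be its own nearest match.
query = database[42]
distances, neighbours = index.query(project(query[None, :], mean, axes), k=3)
```

Here `neighbours[0]` holds the indices of the three closest database entries and `distances[0]` the corresponding Euclidean distances in the reduced space.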
References
1. Le, V.P., Nayef, N., Visani, M., Ogier, J.-M., De Tran, C.: Document retrieval based
on logo spotting using key-point matching. In: 22nd International Conference on
Pattern Recognition (ICPR), pp. 3056–3061 (2014)
2. Romberg, S., Pueyo, L.G., Lienhart, R., van Zwol, R.: Scalable logo recognition
in real-world images. In: ACM International Conference on Multimedia Retrieval
(ICMR 2011), Trento, April 2011
3. Alaei, A., Delalandre, M., Girard, N.: Logo detection using painting based
representation and probability features. In: Proceedings of 12th International
Conference on Document Analysis and Recognition (ICDAR 2013), pp. 1267–1271.
IEEE Computer Society (2013)
4. Divya Susmitha, C., Padmalatha, L.: Context dependent logo detection and
recognition based on context dependent similarity kernel. Int. J. Comput.
Appl. 106(11), November 2014. (0975–8887)
5. Lei, B., Thing, V.L.L., Chen, Y., Lim, W.-Y.: Logo classification with edge-based
DAISY descriptor. In: 2012 IEEE International Symposium on Multimedia (ISM),
pp. 222–228, 10–12 December 2012
6. Zhu, G., Doermann, D.: Logo matching for document image retrieval. In: 10th
International Conference on Document Analysis and Recognition (2009)
7. Jain, R., Doermann, D.: Logo retrieval in document images. In: Proceedings of the
2012 10th IAPR International Workshop on Document Analysis Systems, DAS
2012, pp. 135–139 (2012)
8. Zhu, G., Doermann, D.: Automatic document logo detection. In: ICDAR,
pp. 864–868 (2007)
9. Soma, S., Dhandra, B.V.: Automatic logo recognition system from the complex
document using shape and moment invariant features. Int. J. Adv. Comput. Sci.
Technol. (IJACST) 4(2), 06–13 (2015)
10. Rusiñol, M., Lladós, J.: Logo spotting by a bag-of-words approach for
document categorization. In: ICDAR, pp. 111–115 (2009)
11. Wang, H., Chen, Y.: Logo detection in document images based on boundary
extension of feature rectangles. In: ICDAR, pp. 1335–1339 (2009)
12. Rusiñol, M., D’Andecy, V.P., Karatzas, D., Lladós, J.: Classification of
administrative document images by logo identification. In: Proceedings of the
9th ICGR (2011). doi:10.1007/978-3-642-36824-0_5
13. Rusiñol, M., Lladós, J.: Efficient logo retrieval through hashing shape context
descriptors. In: DAS 2010, Boston, MA, USA, 9–11 June 2010
14. Li, K.-W., Chen, S.-Y., Su, S., Duh, D.-J., Zhang, H., Li, S.: Logo Detection with
Extendibility and Discrimination. Springer Science+Business Media, New York
(2013)
15. Liu, X., Zhang, B.: Automatic collecting representative logo images from
the internet. Tsinghua Sci. Technol. 18(6), 606–617 (2013)
16. Nishanth, T.R., Simon, J.: Logo Matching, recognition with interest points using
context-dependent similarity. In: International Conference on Humming Bird,
March 2014. International Journal of Engineering Research, Applications (IJERA),
ISSN 2248–9622
17. Datta, S.: Dissertation on logo recognition using moment invariants. Thesis for
Master in Multimedia Development (2013)
18. Keyvanpour, M., Tavoli, R.: Document image retrieval: algorithms, analysis and
promising directions. Int. J. Softw. Eng. Appl. 7(1), 93–106 (2013)
19. Romberg, S., Pueyo, L.G., Lienhart, R., van Zwol, R.: Scalable logo
recognition in real-world images. In: Proceedings of ACM International
Conference on Multimedia Retrieval. ACM (2011)
20. Doermann, D.S., Rivlin, E., Weiss, I.: Logo recognition using geometric
invariants, pp. 894–897. IEEE (1993)
21. Soffer, A., Samet, H.: Using negative shape features for logo similarity matching. In:
14th International Conference on Pattern Recognition, Brisbane, Australia (1998)
22. Folkers, A., Samet, H.: Content-based image retrieval using Fourier descriptors on
a logo database. In: Proceedings of the 16th International Conference on Pattern
Recognition, Quebec City, Canada, Vol. III, pp. 521–524, August 2002
23. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern
Anal. Mach. Intell. 8(6), 679–698 (1986)