106 jmlr-2012-Sign Language Recognition using Sub-Units


Source: pdf

Author: Helen Cooper, Eng-Jon Ong, Nicolas Pugeault, Richard Bowden

Abstract: This paper discusses sign language recognition using linguistic sub-units. It presents three types of sub-units for consideration: those learnt from appearance data, and those inferred from either 2D or 3D tracking data. These sub-units are then combined using a sign-level classifier, for which two options are presented. The first uses Markov Models to encode the temporal changes between sub-units. The second uses Sequential Pattern Boosting to apply discriminative feature selection at the same time as encoding temporal information. The second approach is more robust to noise and performs well in signer-independent tests, improving results from the 54% achieved by the Markov Chains to 76%.

Keywords: sign language recognition, sequential pattern boosting, depth cameras, sub-units, signer independence, data set
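As a rough illustration of the first sign-level classifier mentioned in the abstract, the sketch below trains one first-order Markov chain per sign over discrete sub-unit labels and assigns a test sequence to the sign whose chain gives it the highest log-likelihood. It is not the authors' implementation: the integer sub-unit encoding, the add-alpha smoothing, the toy sequences, and the names train_chain, log_likelihood and classify are all assumptions made for this example.

```python
import math

def train_chain(sequences, n_states, alpha=1.0):
    """Estimate log start and log transition probabilities for one sign.

    sequences: non-empty lists of integer sub-unit labels in [0, n_states).
    alpha: add-alpha smoothing constant (an assumption of this sketch).
    """
    start = [alpha] * n_states
    trans = [[alpha] * n_states for _ in range(n_states)]
    for seq in sequences:
        start[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1
    start_total = sum(start)
    log_start = [math.log(c / start_total) for c in start]
    log_trans = [[math.log(c / sum(row)) for c in row] for row in trans]
    return log_start, log_trans

def log_likelihood(model, seq):
    """Log-probability of a non-empty sub-unit sequence under one chain."""
    log_start, log_trans = model
    score = log_start[seq[0]]
    for a, b in zip(seq, seq[1:]):
        score += log_trans[a][b]
    return score

def classify(models, seq):
    """Return the sign whose chain assigns the sequence the highest score."""
    return max(models, key=lambda sign: log_likelihood(models[sign], seq))

# Toy training data: hypothetical sub-unit label sequences for two signs.
training = {
    "sign_a": [[0, 1, 2], [0, 1, 1, 2], [0, 0, 1, 2]],
    "sign_b": [[2, 1, 0], [2, 2, 1, 0], [2, 1, 1, 0]],
}
models = {sign: train_chain(seqs, n_states=3) for sign, seqs in training.items()}
print(classify(models, [0, 1, 2]))
```

A chain of this kind scores only consecutive sub-unit pairs and uses every sub-unit, noisy or not; the abstract attributes the better signer-independent results of Sequential Pattern Boosting to its discriminative feature selection.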


reference text

Y. Amit and D. Geman. Shape quantization and recognition with randomized trees. Neural Computation, 9:1545–1588, 1997.
L. Breiman. Random forests. Machine Learning, pages 5–32, 2001.
British Deaf Association. Dictionary of British Sign Language/English. Faber and Faber, 1992.
P. Buehler, M. Everingham, and A. Zisserman. Learning sign language by watching TV (using weakly aligned subtitles). In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 2961–2968, Miami, FL, USA, June 20–26 2009.
H. Cooper and R. Bowden. Large lexicon detection of sign language. In Proceedings of the IEEE International Conference on Computer Vision: Workshop Human Computer Interaction, pages 88–97, Rio de Janeiro, Brazil, October 16–19 2007. doi: 10.1007/978-3-540-75773-3_10.
H. Cooper and R. Bowden. Sign language recognition using linguistically derived sub-units. In Proceedings of the Language Resources and Evaluation Conference Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Languages Technologies, Valletta, Malta, May 17–23 2010.
R. Elliott, J. Glauert, J. Kennaway, and K. Parsons. D5-2: SiGML Definition. ViSiCAST Project working document, 2001.
H. Ershaed, I. Al-Alali, N. Khasawneh, and M. Fraiwan. An arabic sign language computer interface using the xbox kinect. In Annual Undergraduate Research Conf. on Applied Computing, May 2011.
Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the European Conference on Computational Learning Theory, pages 23–37, Barcelona, Spain, March 13–15 1995. Springer-Verlag. ISBN 3-540-59119-2.
J.W. Han, G. Awad, and A. Sutherland. Modelling and segmenting subunits for sign language recognition based on hand motion analysis. Pattern Recognition Letters, 30(6):623–633, April 2009.
T. Hanke and C. Schmaling. Sign Language Notation System. Institute of German Sign Language and Communication of the Deaf, Hamburg, Germany, January 2004. URL http://www.signlang.uni-hamburg.de/projects/hamnosys.html.
M. K. Hu. Visual pattern recognition by moment invariants. IRE Transactions on Information Theory, IT-8:179–187, February 1962.
T. Kadir, R. Bowden, E.J. Ong, and A. Zisserman. Minimal training, large lexicon, unconstrained sign language recognition. In Proceedings of the BMVA British Machine Vision Conference, volume 2, pages 939–948, Kingston, UK, September 7–9 2004.
S. Kim and M.B. Waldron. Adaptation of self organizing network for ASL recognition. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 254–254, San Diego, California, USA, October 28–31 1993.
W.W. Kong and S. Ranganath. Automatic hand trajectory segmentation and phoneme transcription for sign language. In Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, pages 1–6, Amsterdam, The Netherlands, September 17–19 2008. doi: 10.1109/AFGR.2008.4813462.
S.K. Liddell and R.E. Johnson. American sign language: The phonological base. Sign Language Studies, 64:195–278, 1989.
K. Lyons, H. Brashear, T. L. Westeyn, J. S. Kim, and T. Starner. Gart: The gesture and activity recognition toolkit. In Proceedings of the International Conference HCI, pages 718–727, July 2007.
E. J. Ong and R. Bowden. Learning sequential patterns for lipreading. In Proceedings of the BMVA British Machine Vision Conference, Dundee, UK, August 29 – September 10 2011.
OpenNI User Guide. OpenNI organization, November 2010. Last viewed 20-04-2011 18:15.
V. Pitsikalis, S. Theodorakis, C. Vogler, and P. Maragos. Advances in phonetics-based sub-unit modeling for transcription alignment and sign language recognition. In Proceedings of the International Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshop: Gesture Recognition, Colorado Springs, CO, USA, June 21–23 2011.
Prime Sensor™ NITE 1.3 Algorithms notes. PrimeSense Inc., 2010. Last viewed 20-04-2011 18:15.
A. Roussos, S. Theodorakis, V. Pitsikalis, and P. Maragos. Hand tracking and affine shape-appearance handshape sub-units in continuous sign language recognition. In Proceedings of the International Conference European Conference on Computer Vision Workshop: SGA, Heraklion, Crete, September 5–11 2010.
J. E. Shoup. Phonological aspects of speech recognition. In Wayne A. Lea, editor, Trends in Speech Recognition, pages 125–138. Prentice-Hall, Englewood Cliffs, NJ, 1980.
T. Starner and A. Pentland. Real-time american sign language recognition from video using hidden markov models. Computational Imaging and Vision, 9:227–244, 1997.
W.C. Stokoe. Sign language structure: An outline of the visual communication systems of the american deaf. Studies in Linguistics: Occasional Papers, 8:3–37, 1960.
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 1, pages 511–518, Kauai, HI, USA, December 2001.
C. Vogler and D. Metaxas. Adapting hidden markov models for ASL recognition by using three-dimensional computer vision methods. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, volume 1, pages 156–161, Orlando, FL, USA, October 12–15 1997.
C. Vogler and D. Metaxas. Parallel hidden markov models for american sign language recognition. In Proceedings of the IEEE International Conference on Computer Vision, volume 1, pages 116–122, Corfu, Greece, September 21–24 1999.
M. B. Waldron and S. Kim. Increasing manual sign recognition vocabulary through relabelling. In Proceedings of the IEEE International Conference on Neural Networks IEEE World Congress on Computational Intelligence, volume 5, pages 2885–2889, Orlando, Florida, USA, June 27 – July 2 1994. doi: 10.1109/ICNN.1994.374689.
M. B. Waldron and S. Kim. Isolated ASL sign recognition system for deaf persons. IEEE Transactions on Rehabilitation Engineering, 3(3):261–271, September 1995. doi: 10.1109/86.413199.
M.B. Waldron and D. Simon. Parsing method for signed telecommunication. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society: Images of the Twenty-First Century, volume 6, pages 1798–1799, Seattle, Washington, USA, November 1989. doi: 10.1109/IEMBS.1989.96461.
H. Wassner. kinect + reseau de neurone = reconnaissance de gestes. http://tinyurl.com/5wbteug, May 2011.
P. Yin, T. Starner, H. Hamilton, I. Essa, and J.M. Rehg. Learning the basic units in american sign language using discriminative segmental feature selection. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 4757–4760, Taipei, Taiwan, April 19–24 2009. doi: 10.1109/ICASSP.2009.4960694.
Z. Zafrulla, H. Brashear, P. Presti, H. Hamilton, and T. Starner. Copycat - center for accessible technology in sign. http://tinyurl.com/3tksn6s, December 2010. URL http://www.youtube.com/watch?v=qFH5rSzmgFE&feature=related.
Z. Zafrulla, H. Brashear, T. Starner, H. Hamilton, and P. Presti. American sign language recognition with the kinect. In Proceedings of the 13th International Conference on Multimodal Interfaces, ICMI ’11, pages 279–286, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0641-6. doi: 10.1145/2070481.2070532. URL http://doi.acm.org/10.1145/2070481.2070532.