Intelligent Speech and Acoustic Signal Processing Laboratory at University of Electro-Communications
Journal Papers
2021
Toru Nakashika and Kohei Yatabe
Gamma Boltzmann Machine for Audio Modeling
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.29, pp.2591-2605, 2021.
2020
Takuya Kishida and Toru Nakashika
Speech chain VC: linking linguistic and acoustic levels via latent distinctive features for RBM-based voice conversion
IEICE Transactions on Information and Systems, Vol.E103-D, No.11, pp.1-11, August 2020.
2019
Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition
EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-019-0160-1,
pp.1-11, August 2019.
Kentaro Sone and Toru Nakashika
Pre-Training of DNN-Based Speech Synthesis Based on Bidirectional Conversion between Text and
Speech
IEICE Transactions on Information and Systems, Vol.E102-D, No.8, pp.1546-1553, August 2019.
2018
Toru Nakashika, Shinji Takaki, and Junichi Yamagishi
Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From
Complex Spectra
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.27, No.2, pp.244-254, Oct. 2018.
2017
Toru Nakashika and Yasuhiro Minami
Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice
conversion
EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-017-0112-6, pp.1-10,
June 2017.
Toru Nakashika
Deep Relational Model: A Joint Probabilistic Model with a Hierarchical Structure for Bidirectional
Estimation of Image and Labels
IEICE Transactions on Information and Systems, Vol.E101-D, No.2, pp.428-436, Feb. 2018.
2016
Toru Nakashika, Tetsuya Takiguchi, and Yasuhiro Minami
Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.24, No.11, pp.2032-2045, Nov.
2016.
Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
Phone Labeling Based on the Probabilistic Representation for Dysarthric Speech Recognition
American Journal of Signal Processing, Vol. 6, No. 1, pp. 19-23, doi:10.5923/j.ajsp.20160601.03,
June 2016.
2015
Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Voice conversion using speaker-dependent conditional restricted Boltzmann machine
EURASIP Journal on Audio, Speech, and Music Processing, 2015:8, DOI: 10.1186/s13636-014-0044-3,
12 pages, February 2015.
Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol.23, No.3, pp.580-587, March
2015.
2014
Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Probabilistic spectral envelope modeling of musical instruments within the non-negative matrix
factorization framework for mixed music analysis
Acoustical Science and Technology, Vol.35, No.4, pp.181-191, July 2014.
Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Parallel Dictionary Learning Using a Joint Density Restricted Boltzmann Machine for
Sparse-Representation-Based Voice Conversion
Advances in Computer Science and Engineering, Vol.12, No.2, pp.101-117, June 2014.
Toru Nakashika, Toshiya Yoshioka, Tetsuya Takiguchi, Yasuo Ariki, Stefan Duffner, and Christophe
Garcia
Convolutive Bottleneck Network with Dropout for Dysarthric Speech Recognition
Transactions on Machine Learning and Artificial Intelligence, Vol.2, No.2, pp.46-60, April 2014.
Toru Nakashika, Takeshi Okumura, Tetsuya Takiguchi, and Yasuo Ariki
Hierarchical Sparse Representation for Object Recognition
Transactions on Machine Learning and Artificial Intelligence, Vol.2, No.1, pp.46-60, February 2014.
Toru Nakashika, Takafumi Hori, Tetsuya Takiguchi, and Yasuo Ariki
Depth Spatial Pyramid: a Pooling Method for 3D-Object Recognition
Advances in Computer Science and Engineering, Vol.12, No.1, pp.15-30, 2014.
Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki
Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines
IEICE Transactions on Information and Systems, Vol.E97-D, No.6, pp.1403-1410, June 2014.
2013
Daiki Nishimura, Toru Nakashika, Tetsuya Takiguchi, and Yasuo Ariki
Mixed Music Analysis with Extended Specmurt
Journal of Software Engineering and Applications, Vol.6, No.5, pp.274-279, May 2013.
Daiki Nishimura, Toru Nakashika, Tetsuya Takiguchi, and Yasuo Ariki
Sparseness Criteria of F0-Frequencies Selection for Specmurt-Based Multi-Pitch Analysis without
Modeling Harmonic Structure
Journal of Signal Processing, Vol. 17, No. 2, pp.29-38, March 2013.
International Conferences
2022
Kotaro Onishi, Toru Nakashika, "MoCoVC: Non-Parallel Voice Conversion With Momentum Contrastive Representation Learning," Proc. APSIPA, pp. 1438-1443, Nov. 2022.
Kotaro Onishi, Toru Nakashika, "Consistency Regularization for GAN-Based Neural Vocoders," Proc. APSIPA, pp. 1131-1136, Nov. 2022.
Takumi Isako, Kotaro Onishi, Takuya Kishida, Toru Nakashika,
"Controllable Voice Conversion Based on Quantization of Voice Factor Scores,"
Proc. APSIPA, pp. 1444-1448, Nov. 2022.
2020
Toru Nakashika and Kohei Yatabe,
"Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra,"
Proceedings of APSIPA Annual Summit and Conference 2020, pp. 471-476, December 2020.
Toru Nakashika,
"Complex-Valued Variational Autoencoder: A Novel Deep Generative Model for Direct Representation of Complex Spectra,"
Proceedings of the Interspeech 2020, pp. 2002-2006, October 2020.
Takuya Kishida, Shin Tsukamoto and Toru Nakashika,
"Simultaneous Conversion of Speaker Identity and Emotion Based on Multiple-Domain Adaptive RBM,"
Proceedings of the Interspeech 2020, pp. 3431-3435, October 2020.
Michel Pezzat, Hector Perez-Meana, Toru Nakashika and Mariko Nakano,
"Many-to-Many Symbolic Multi-track Music Genre Transfer,"
Proceedings of the SoMeT 2020, pp. 272-281, September 2020.
2019
Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki,
"Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition,"
EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-019-0160-1, pp. 1-11, August 2019.
Shinji Takaki, Toru Nakashika, Xin Wang and Junichi Yamagishi,
"STFT spectral loss for training a neural speech waveform model,"
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), pp. 7065-7069, May 2019.
2018
Kentaro Sone and Toru Nakashika,
"DNN-based Speech Synthesis for Small Data Sets Considering Bidirectional Speech-Text Conversion,"
Proceedings of the Interspeech 2018, pp. 2519-2523, September 2018.
Toru Nakashika,
"LSTBM: A Novel Sequence Representation of Speech Spectra Using Restricted Boltzmann Machine with Long Short-Term Memory,"
Proceedings of the Interspeech 2018, pp. 2529-2533, September 2018.
Kentaro Sone, Shinji Takaki and Toru Nakashika,
"Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model,"
Proceedings of the Odyssey 2018, pp. 261-266, June 2018.
Yuki Takashima, Hajime Yano, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki,
"Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition,"
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), pp. 5294-5298, April 2018.
2017
Toru Nakashika and Eriko Aiba,
"Practice Process Analysis Using Score Matching Method Based on OBE-DTW and its Effects on Memorizing Musical Score,"
Proceedings of International Symposium on Performance Science 2017 (ISPS2017), pp. 66-67, September 2017.
Toru Nakashika,
"CAB: An energy-based speaker clustering model for rapid adaptation in non-parallel voice conversion,"
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, pp. 3369-3373, 2017.
Toru Nakashika, Shinji Takaki and Junichi Yamagishi,
"Complex-valued restricted Boltzmann machine for direct learning of frequency spectra,"
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, pp. 4021-4025, 2017.
2016
Toru Nakashika and Yasuhiro Minami,
"3WRBM-Based Speech Factor Modeling for Arbitrary-Source and Non-Parallel Voice Conversion,"
Interspeech 2016, pp. 1487-1491, September 2016.
Zhaojie Luo, Jinhui Chen, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki,
"Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform,"
The 9th ISCA Speech Synthesis Workshop (SSW), pp. 153-158, September 2016.
Toru Nakashika, Tetsuya Takiguchi and Yasuhiro Minami,
"Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine,"
IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 24, No. 11, pp. 2032-2045, Nov. 2016.
Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki,
"Phone Labeling Based on the Probabilistic Representation for Dysarthric Speech Recognition,"
American Journal of Signal Processing, Vol. 6, No. 1, pp. 19-23, June 2016.