ASAP2019

Technical Program

Presentation Guidelines

Monday July 15th (Tata Innovation Center, Room 023 & 131)
8:30-9:00	Registration
9:00-9:15	Welcome Remarks (Slides)
9:15-10:00	Keynote I (Chair: Zhiru Zhang) DNA Data Storage and Near-Molecule Processing for the Yottabyte Era Luis Ceze, University of Washington
10:10-10:50	Paper Session – Applications: Machine Learning I (Chair: Roger Moussalli, Two Sigma)
26	F-E3D: FPGA-based Acceleration of An Efficient 3D Convolutional Neural Network for Human Action Recognition (Best Paper Nominee) (Slides) Hongxiang Fan¹, Cheng Luo³, Chenglong Zeng², Martin Ferianc¹, Zhiqiang Que¹, Shuanglong Liu¹, Xinyu Niu¹, Wayne Luk¹ Imperial College London¹, Fudan University², Corerain Technologies³
80	LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism (Slides) Tong Geng^1,2, Tianqi Wang¹, Chunshu Wu¹, Chen Yang¹, Shuaiwen Leon Song², Ang Li², Martin Herbordt¹ Boston University¹, Pacific Northwest National Laboratory²
10:50-11:10	Coffee Break
11:10-11:50	Paper Session – Applications: Machine Learning II (Chair: Roger Moussalli, Two Sigma)
37	Efficient Weight Reuse for Large LSTMs (Slides) Zhiqiang Que¹, Thomas Nugent¹, Shuanglong Liu¹, Li Tian³, Xinyu Niu², Yongxin Zhu³, Wayne Luk¹ Imperial College London¹, Chinese Academy of Sciences², Corerain Technologies Ltd.³
60	Photonic Processor for Fully Discretized Neural Networks (Slides) Jeff Anderson, Shuai Sun, Yousra Alkabani, Volker Sorger, Tarek El-Ghazawi George Washington University
11:50-12:10	Lightning Session for Posters (Slides) (Chair: Cunxi Yu, University of Utah)
	Transparent Heterogeneous Cloud Acceleration (16) Jessica Vandebon¹, José G. F. Coutinho¹, Wayne Luk¹, Thomas Chau² Imperial College London¹, Intel²
	CRbS: A Code Reordering based Speeding-up Method of Irregular Loops On CMP (24) Yuancheng Li, Jiaqi Shi Xi'an University of Science and Technology
	Impact of Structural Faults on Neural Network Performance (51) Krishna Teja Chitty-Venkata and Arun Somani Iowa State University
	Energy-Efficient Near-Sensor Convolution using Pulsed Unary Processing (99) M. Hassan Najafi¹, S. Rasoul Faraji², Kia Bazargan², and David Lilja² University of Louisiana Lafayette¹, University of Minnesota Minneapolis²
	An Efficient Application Specific Instruction Set Processor (ASIP) for Tensor Computation (101) Wei-pei Huang, Ray C.C. Cheung, Hong Yan City University of Hong Kong
	MITRACA: Manycore Interlinked Torus Reconfigurable Accelerator Architecture (102) Riadh Ben Abdelhamid, Yoshiki Yamaguchi, Taisuke Boku University of Tsukuba
	DeltaNet: Differential Binary Neural Network (103) Yuka Oba¹, Kota Ando¹, Tetsuya Asai¹, Masato Motomura², Shinya Takamaeda-Yamazaki^1,3 Hokkaido University¹, Tokyo Institute of Technology², JST PRESTO³
	Using Residue Number Systems to Accelerate Deterministic Bit-stream Multiplication (106) Kamyar Givaki¹, Reza Hojabr¹, M. Hassan Najafi², Ahmad Khonsari^1,3, M. H. Gholamrezayi⁴, Saeid Gorgin⁵, Dara Rahmati⁴ University of Tehran¹, University of Louisiana Lafayette², Institute for Research in Fundamental Sciences Iran³, Shahid Beheshti University⁴, Iranian Research Organization for Science and Technology⁵
	Precision Adaptation for Fast and Accurate Polynomial Evaluation Generation (108) Nicolas Brunie¹, Christoph Lauter², Guillaume Revy^3,4 Kalray S.A.¹, University of Alaska Anchorage², University of Perpignan Via Domitia³, University of Montpellier⁴
12:10-13:30	Lunch & Poster Session I Short papers (11, 49, 74, 96) and Posters (16, 24, 51, 99, 101, 102, 103, 106, 108)
13:30-14:15	Keynote II (Chair: Wayne Luk) From AI1.0, AI2.0, to XAI3.0 Sun-Yuan Kung, Princeton University
14:25-15:30	Paper Session - Architecture and Synthesis (Chair: Sang-Woo Jun, UC Irvine)
9	Maestro: A Memory-on-Logic Architecture for Coordinated Parallel Use of Many Systolic Arrays (Slides) Bradley McDanel¹, HT Kung¹, Sai Qian Zhang¹, Xin Dong¹, Chih Chiang Chen² Harvard University¹, MediaTek²
43	Sparstition: A Partitioning Scheme for Large-Scale Sparse Matrix Vector Multiplication on FPGA (Slides) Bjorn Sigurbergsson¹, Tom Hogervorst¹, Tong Dong Qiu², Razvan Nane^1,2 TU Delft¹, Big Data Accelerate²
31	End-to-end Dynamic Stream Processing on Maxeler HLS Platforms (Slides) Charalampos Kritikakis, Dirk Koch University of Manchester
11 (Short)	Sparse Matrix to Matrix Multiplication: A Representation and Architecture for Acceleration (Slides) Pareesa Ameneh Golnari¹, Sharad Malik² Google¹, Princeton University²
74 (Short)	HelmGemm: Managing GPUs and FPGAs for transprecision GEMM workloads in containerized environments (Slides) Dionysios Diamantopoulos IBM Research Zurich
15:30-15:50	Coffee Break
15:50-17:00	Paper Session – Applications: Machine Learning, Robotics, and Simulation I (Chair: Bo Yuan, Rutgers)
30	Error Analysis of the Square Root Operation for the Purpose of Precision Tuning: a Case Study on K-means (Slides) Oumaima Matoussi¹, Yves Durand¹, Olivier Sentieys², Anca Molnos¹ CEA, LETI, University Grenoble Alpes¹, Univ. Rennes²
50	FPGA Architectures for Real-time Dense SLAM (Best Paper Award) (Slides) Quentin Gautier, Alric Althoff, Ryan Kastner UC San Diego
28	Customizable Control Policy Learning for Robotics (Slides) Ce Guo¹, Wayne Luk¹, Stanley Loh Qing Shui¹, Alexander Warren², Joshua Levine² Imperial College London¹, Intel²
49 (Short)	Resilient Neural Network Training for Accelerators with Computing Errors (Slides) Dawen Xu^1,3, Kouzi Xing², Cheng Liu¹, Ying Wang¹, Yulin Dai², Long Cheng³, Huawei Li¹, Lei Zhang¹ Chinese Academy of Sciences¹, Hefei University of Technology², University College Dublin³
76 (Short)	VLIW Based Runtime Reconfigurable Machine Vision Coprocessor Architecture for Edge Computing (Slides) Dilshan Kumarathunga, Omega Gamage, Asitha Samarasinghe, Nipuna Saranga, Ranga Rodrigo, Ajith Pasqual University of Moratuwa
17:10-17:20	Announcement by ASAP'20 Chair
17:30-21:00	Reception -- Peking Duck House, 236 E 53rd St, Midtown, New York City [direction] [more info]
Tuesday July 16th (Bloomberg Center, Room 161 & 165)
09:00-09:45	Keynote III (Chair: Yun (Eric) Liang, Peking University) Heterogeneous Systems Research - in the Mood for AI in the age of Cloud and IoT Jinjun Xiong, IBM T.J. Watson Research Center
10:00-10:40	Invited Session: Hardware Acceleration (Chair: Yun (Eric) Liang, Peking University)
	Agile FPGA Design (Slides) Justin Thiel, Two Sigma
	PAI-FCNN: FPGA based inference system for complex CNN models (Slides) Lixue Xia, Lansong Diao, Zhao Jiang, Hao Liang, Kai Chen, Li Ding, Shunli Dou, Zibin Su, Meng Sun, Jiansong Zhang, Wei Lin Alibaba Group
10:40-11:00	Coffee Break
11:00-11:45	Paper Session - Applications: Image Processing, Networking, and Floating Point Arithmetic I (Chair: Guojie Luo, Peking University)
66	Event-Based Re-configurable Hierarchical Processors for Smart Image Sensors (Slides) Pankaj Bhowmik, Md Jubaer Hossain Pantho, Christophe Bobda University of Florida
69	OpenVX Graph Optimization for Visual Processor Units (Slides) Madushan Abeysinghe¹, Jesse Villarreal², Lucas Weaver², Jason D. Bakos¹ University of South Carolina¹, Texas Instruments Corporation²
44 (Short)	Application Specific Architecture for Hardware Accelerating HOG-SVM to achieve High Throughput on HD Frames (Slides) Piyumal Ranawaka¹, Mongkol Ekpanyapong², Adriano Tavares³, Jorge Cabral², Krit Athikulwongse³, Vitor Silva⁴ University of Moratuwa¹, Asian Institute of Technology², National Science and Technology Development Agency³, University of Minho⁴
11:45-12:00	Lightning Session for Posters (Slides) (Chair: Cunxi Yu, University of Utah)
	A Quantitative Approach for Refactoring NFV-based Mobile Core Networks (21) Wei-Kuo Chiang, He-Xin Chen National Chung Cheng University
	Fooling AI with AI: An Accelerator for Adversarial Attacks on Deep Learning Visual Classification (33) Haoqiang Guo¹, Lu Peng¹, Jian Zhang¹, Fang Qi¹, Lide Duan² Louisiana State University¹, Alibaba Group²
	A Virtual Image Accelerator for Graph Cuts Inference on FPGA (53) Tianqi Gao¹, Rob A. Rutenbar² University of Illinois at Urbana Champaign¹, University of Pittsburgh²
	Implications for Hardware Acceleration of Malware Detection (67) Jordan Pattee, Byeong Kil Lee University of Colorado, Colorado Springs
	GPUs Pipeline Latency Analysis (93) Yehia Arafa¹, Abdel-Hameed A. Badawy^1,2, Gopinath Chennupati², Nandakishore Santhi², Stephan Eidenbenz² New Mexico State University¹, Los Alamos National Laboratory²
	Context-Aware Number Generator for Deterministic Bit-stream Computing (100) Sina Asadi and M. Hassan Najafi University of Louisiana Lafayette
	Smart Rabbit: A Wearable Device As Intelligent Pacer for Marathon Runners (110) Wenpei Zheng, Sheng-Yang Chiu, Jui-Chien Hsieh, Chaochang Chiu Yuan Ze University
12:00-13:20	Lunch & Poster Session II Short papers (17, 44, 57, 68, 73, 95) and Posters (21, 33, 53, 67, 93, 100, 110)
13:20-14:20	Invited Session: In Memory Computing (Chair: Zhiru Zhang, Cornell University)
	Real Processing-in-Memory with Memristive Memory Processing Unit (Slides) Shahar Kvatinsky, Technion – Israel Institute of Technology
	PPAC: A Versatile In-Memory Accelerator for Matrix-Vector-Product-Like Operations (Slides) Oscar Castañeda, Maria Bobbett, Alexandra Gallyas-Sanhueza, and Christoph Studer Cornell University
	Parallel Stateful Logic in RRAM: Theoretical Analysis and Arithmetic Design (Slides) Feng Wang, Guojie Luo, Guangyu Sun, Jiaxi Zhang, Peng Huang, Jinfeng Kang Peking University
14:30-15:20	Paper Session – Applications: Machine Learning, Robotics, and Simulation II (Chair: Jieming Yin, AMD)
71	An Overlay Architecture for High-Throughput Pattern Matching (Slides) Rasha Karakchi, Charles A. Daniels, Jason D. Bakos University of South Carolina
59	Towards Real Time Radiotherapy Simulation (Slides) Nils Voss^1,3, Peter Ziegenhein³, Lukas Vermond^2,4, Joost Hoozemans², Oskar Mencer², Uwe Oelfke³, Wayne Luk¹, Georgi Gaydadjiev^1,2,4 Imperial College London¹, Maxeler Technologies², The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust³, Delft University of Technology⁴
68 (Short)	Accelerating AP3M-Based Computational Astrophysics Simulations with Reconfigurable Clusters (Slides) Tianqi Wang¹, Tong Geng², Xi Jin¹, Martin Herbordt² University of Science and Technology of China¹, Boston University²
17 (Short)	A Programmable Architecture for Robot Motion Planning Acceleration (Slides) Sean Murray¹, Will Floyd-Jones¹, George Konidaris², Dan Sorin¹ Duke University¹, Brown University²
15:20-15:40	Coffee Break
15:40-16:40	Paper Session - Emerging Technologies (Chair: Jason Bakos, University of South Carolina)
109	Leveraging Energy Cycle Regularity to Predict Adaptive Mode for Non-volatile Processors (Slides) Zejun Shi¹, Dongqin Zhou², Keni Qiu², Jiwu Shu¹ Tsinghua University¹, Capital Normal University²
58	An Adaptive Memory Management Strategy Towards Energy Efficient Machine Inference in Event-Driven Neuromorphic Accelerators (Slides) Saunak Saha, Henry Duwe, Joseph Zambreno Iowa State University
10	Improving Emulation of Quantum Algorithms using Space-Efficient Hardware Architectures (Slides) Naveed Mahmud, Esam El-Araby University of Kansas
95 (Short)	Combining Clock and Voltage Noise Countermeasures against Power Side-Channel Analysis (Slides) Jacqueline Lagasse, Christopher Bartoli, Wayne Burleson University of Massachusetts Amherst
16:40-17:30	Paper Session - Applications: Image Processing, Networking, and Floating Point Arithmetic II (Chair: Byeong Kil Lee, University of Colorado)
78	Investigating the Feasibility of FPGA-based Network Switches (Slides) Jiuxi Meng¹, Nadeen Gebara¹, Ho-Cheung Ng¹, Paolo Costa², Wayne Luk¹ Imperial College London¹, Microsoft²
12	A Well-Equipped Implementation: Normal/Denormalized Half/Single/Double Precision IEEE 754 Floating-Point Adder/Subtracter (Slides) Brett Mathis, James E. Stine Oklahoma State University
73 (Short)	SMPTE ST 2110 Compliant Scalable Architecture on FPGA for end to end Uncompressed Professional Video Transport over IP Networks (Slides) Nisal Ranasinghe¹, Ravindu Bangamuarachchi¹, Jayath Seneviratne¹, Achini Jayawardane¹, R.M.A.U. Senarath², Ajith Pasqual¹ University of Moratuwa¹, Paraqum Technologies²
20:00-21:30	Social Event -- Top of the Rock, 30 Rockefeller Plaza, Enter on West 50th Street, New York City [direction] [more info]
Wednesday July 17th (Tata Innovation Center, Room 023 & 131)
09:00-09:40	Keynote IV (Chair: Zhiru Zhang, Cornell University) Towards High-Level Approaches to Hardware Cyber Security (Slides) Ramesh Karri, NYU
09:40-10:20	Invited Session: Hardware Acceleration II (Chair: Yun (Eric) Liang, Peking University)
	Graph-Morphing: Exploiting Hidden Parallelism of Non-Stencil Computation in High-Level Synthesis (Slides) Mingjie Lin, University of Central Florida
	Understanding Performance Gains of Accelerator-rich Architectures (Slides) Zhenman Fang¹, Farnoosh Javadi², Jason Cong², Glenn Reinman² Simon Fraser University¹, UCLA²
10:20-10:40	Coffee Break
10:40-11:20	Paper Session - Design Methodologies I (Chair: Zhenman Fang, Simon Fraser University)
18	Base64 Encoding on Heterogeneous Computing Platforms (Slides) Zheming Jin, Hal Finkel Argonne National Laboratory
25	Statistical Performance Prediction for Multicore Applications Based on Scalability Characteristics (Slides) Oliver Jakob Arndt, Matthias Lüders, Holger Blume Leibniz University Hannover
11:20-12:20	Paper Session - Design Methodologies II (Chair: Zhenman Fang, Simon Fraser University)
48	Molecular Dynamics Range-Limited Force Evaluation Optimized for FPGAs (Slides) Chen Yang¹, Tong Geng¹, Tianqi Wang^1,2, Jiayi Sheng⁴, Charles Lin³, Vipin Sachdeva³, Woody Sherman³, Martin Herbordt¹ Boston University¹, University of Science and Technology of China², Silicon Therapeutics³, Falcon Computing⁴
39	Refine and Recycle: A Method to Increase Decompression Parallelism (Slides) Jian Fang¹, Jianyu Chen¹, Jinho Lee², Zaid Al-Ars¹, H. Peter Hofstee^1,2 Delft University of Technology¹, IBM Austin²
62	Efficient Architectures and Implementation of Arithmetic Functions Approximation Based Stochastic Computing (Slides) Tieu-Khanh Luong¹, Van-Tinh Nguyen², Anh-Thai Nguyen³, Emanuel Popovici¹ University College Cork¹, Nara Institute of Science and Technology², Le Quy Don Technical University³
57 (Short)	Bank-Selective Strategy for Gate-based Ternary Content Addressable Memory on FPGAs (Slides) Muhammad Irfan¹, Zahid Ullah², Ray C. C. Cheung² City University of Hong Kong¹, CECOS University of IT & Emerging Sciences²
12:20-12:30	Closing Remarks and Best Paper Award Announcement (Slides)
12:40-13:30	Lunch

Presentation guidelines

Long Papers You will be allocated an 18-minute slot (15 minutes for the presentation plus 3 minutes for Q&A) to present your paper.
Short Papers You will be allocated a 5-minute slot to present your paper, with NO questions. You should focus on the key motivation and results, and encourage listeners to come to your poster to discuss details with you.
Poster Papers You will be allocated 90 seconds (limit of 1 technical slide) in the lightning poster session to present your poster paper. You should focus on the key motivation and encourage listeners to come to your poster to discuss details with you. The Monday lightning poster session (July 15) includes posters: 16, 24, 51, 99, 101, 102, 103, 106, 108. The Tuesday lightning poster session (July 16) includes posters: 21, 33, 53, 67, 93, 100, 110.
Short and Poster Papers In addition to giving a short presentation at the conference, you are also expected to present a poster of your paper at the Poster Session during lunch. The poster dimensions should be standard A1. You can find your poster session on the ASAP web page https://asap2019.csl.cornell.edu/program.html.
