The NLP Group (WING) at the National University of Singapore is involved in cutting-edge research and projects, including data cleaning in the cloud, text mining of clinical and scientific articles, and the development of recommender systems. Under the mentorship of Min-Yen Kan, we focus on diverse areas such as record matching, question answering, and crowdsourcing for machine translation. Our collaborative efforts enable knowledge sharing, aiming to advance technologies that improve digital libraries and the utilization of NLP in various fields.
Web IR / NLP Group (WING) Architecture
Min-Yen Kan
School of Computing, National University of Singapore

Projects
Funded
• CSIDM (CAS, China): Aobo, CSIDM Interns
• ForeCite (Expires Oct 2010): Kaz, Emma, Thang
Proposed
• Data Cleaning in the Cloud (UCI)
• Text Mining Clinical Articles (Duke-NUS / UCI)
  • Shreyasee, Justin
• Text Mining Scientific Articles (Global Asia Institute)
• ForeCite2
WING, NUS

Research Topics
Areas: DL, IR/MM/HCI, NLP
• Yee Fan Tan – Record Matching in Digital Libraries
• Jin Zhao – Math Equation IR
• Jesse Gozali – Phototaking Behavior
• Ziheng Lin – Rhetorical Discourse Analysis
• Cong Duy Vu Hoang – Related Work Summarization
• Jun Ping Ng – Logic in Question Answering
• Aobo Wang – Crowdsourcing for Machine Translation
• Shihong Huang, Wai Hong Loh – Tooltip translator for Firefox
• Kazunari Sugiyama – Recommender Systems in Digital Libraries
• Minh Thang Luong – ForeCite
• Emma Thuy Dung Nguyen – ForeCite
Incoming Staff (4 UROP, 1 Intern):
• Shomir Wilson (Intern) – Mention Detection in Scientific Articles, w/ Jin
• Shawn Tan (UROP) – Continuing PARCELS, w/ Jesse
• Tamisa Huangsiri, Low Wee Hung (UROP) – CSIDM Firefox, w/ Aobo, Jun Ping
• Yipeng Huang (UROP) – Cloud Data Cleaning, w/ Yee Fan, Jin

Responsibilities (to be discussed)
• Kaz: Non-CSIDM UROP guidance
• Yee Fan: None (Thesis Writing!!)
• Jin: RPNLPIR / Meeting and Room Bookings
• Ziheng: Publication Page / Joomla / Social
• Jesse: RoR / FC / CSX
• Aobo: RoR / Web System Admin
• Jun Ping: System Admin Lead

Cluster Architecture
• Systems named after Singapore’s highways
Fixed IP
• CTE – RAID drive host, LDAP host, source code repository
• AYE – webserver, mailserver, mailman, virtual host on ECP
DHCP (.ddns.)
• ECP – LDAP backup
• PIE – compute server
Windows Server (.ddns)
Other hosts: KPE, KJE, BKE, SLE

OS support
All *nix group machines run CentOS 5
• Stable enterprise Linux distribution
• All mount cte’s RAID drive, plus other automounts
Future
• Use rsync to sync all binaries across machines
• Expand the RAID to span disks on different machines for more space (more SAN-like)
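
The planned rsync step could look roughly like the sketch below. It only builds the command lines and does not run them. The host list and the binaries path are assumptions made for illustration; the slides do not say which machines would receive the binaries or where they would live.

```python
import shlex

# Hypothetical values: the slides name these machines, but not which of
# them would receive binaries, nor the directory that would be mirrored.
HOSTS = ["aye", "ecp", "pie"]
BINARY_DIR = "/opt/wing/bin/"  # assumed path, not from the slides


def rsync_command(host, src=BINARY_DIR, dest=BINARY_DIR):
    """Return the argument list for mirroring src onto host:dest."""
    # -a keeps permissions and timestamps; --delete removes files on the
    # destination that no longer exist in src. The trailing slash on src
    # means "copy the directory contents", not the directory itself.
    return ["rsync", "-a", "--delete", src, f"{host}:{dest}"]


def plan(hosts=HOSTS):
    """Return one shell-quoted rsync command line per host."""
    return [shlex.join(rsync_command(host)) for host in hosts]


if __name__ == "__main__":
    for line in plan():
        print(line)
```

Printing the plan first makes it easy to review the commands before handing them to cron or a wrapper script.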

RAID setup
• Currently 5.0 TB in RAID 5?
• ext3, mounted on cte
• /mnt/homes – home directories
• /mnt/rpnlpir-indep – machine-independent data (datasets)
• /mnt/rpnlpir-Linux – binaries
• /mnt/rpnlpir-Windows – binaries
Future
• DB server coming online for Rails applications
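
For reference, RAID 5 spends one disk's worth of space on parity, so usable capacity is (number of disks - 1) × disk size. The sketch below only illustrates that arithmetic. The disk count and size in the example are hypothetical, since the slide gives only the total and is itself unsure of the RAID level.

```python
def raid5_usable_tb(num_disks, disk_tb):
    """Usable capacity in TB of a RAID 5 array of equally sized disks."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    # One disk's worth of capacity is consumed by distributed parity.
    return (num_disks - 1) * disk_tb


if __name__ == "__main__":
    # Hypothetical layout: six 1 TB disks.
    print(raid5_usable_tb(6, 1.0))
```

Under that assumed layout, six 1 TB disks would match the 5.0 TB figure on the slide.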

Webserver (aye.comp.nus.edu.sg)
• Apache
• Virtual hosts (wing.comp, linc.comp, opac.comp)
• Hosts Tomcat for Java servlets
• Hosts gmond (Ganglia monitor)
• Runs webalizer for stats
• Hosts Ruby on Rails apps (Trung’s myror script; to be deprecated soon)
• Hosts web service server (router for web service calls)

Web Services
• Our infrastructure is tuned to expose many services and demos as web services.
• External calls go to port 4000
• List of web services at http://wing.comp.nus.edu.sg/~forecite/
• Calls are handled by the WebServiceServer (WSS) Ruby code.
• Directory for web services currently at /home/forecite/services/
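
A client-side call might be addressed along the lines of the sketch below. The slides give only the port number, so the host name, the path layout, the service name `echo`, and its `q` parameter are all hypothetical; the actual WSS routing scheme is not described here. The code only builds the URL and performs no network request.

```python
from urllib.parse import urlencode, urlunsplit

WSS_HOST = "wing.comp.nus.edu.sg"  # assumed host for external calls
WSS_PORT = 4000                    # port given on the slide


def service_url(service, params=None, host=WSS_HOST, port=WSS_PORT):
    """Build a URL for a web service call (hypothetical path layout)."""
    query = urlencode(params or {})
    return urlunsplit(("http", f"{host}:{port}", f"/{service}", query, ""))


if __name__ == "__main__":
    # "echo" and "q" are placeholder names, not real WING services.
    print(service_url("echo", {"q": "hello"}))
```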

Joomla
• For our website
• Administration by admin@wing, PhD students
Customizations
• Forum integration (phpBB)
• Forum has contact information for all staff
• Forum user DB not yet synced with shadow passwords in LDAP
• RPNLPIR (resource list)
• Blog

Mailing List
• mailman runs on aye
• Lists also run on wing (alias for aye)
• Both local and international mailing lists are hosted here

LDAP
• Keeps logins/UIDs/GIDs synced
• Main server on cte
• Backup on aye
• Needs to be robust in case the LDAP server fails
• Local root for all machines must be maintained
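
The main/backup arrangement implies that clients try cte first and fall back to aye. The sketch below shows that ordering in a generic form. It is not the group's actual client configuration: the short host names, the use of the standard LDAP port 389, and the simple TCP probe are assumptions for illustration.

```python
import socket

LDAP_SERVERS = ["cte", "aye"]  # main first, then backup (short names assumed)
LDAP_PORT = 389                # standard LDAP port; not stated on the slide


def tcp_probe(host, port=LDAP_PORT, timeout=2.0):
    """Open and close a TCP connection; raises OSError if unreachable."""
    with socket.create_connection((host, port), timeout=timeout):
        return True


def first_available(servers, connect):
    """Return (server, result) for the first server that connect() accepts."""
    errors = {}
    for server in servers:
        try:
            return server, connect(server)
        except OSError as exc:
            # Remember why this server failed, then try the next one.
            errors[server] = exc
    raise ConnectionError(f"no LDAP server reachable: {errors}")


if __name__ == "__main__":
    try:
        print(first_available(LDAP_SERVERS, tcp_probe))
    except ConnectionError as exc:
        print(exc)
```

Passing the connect step in as a function keeps the fallback logic testable without a live directory server. If every server fails, local root remains the way in, as the slide notes.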

RPNLPIR (Research Project for NLP / IR)
• Common team account
• Keeps the software repository mirrored by a web page listing
• Keeps CVS repo in ~/CVSDir
• Keeps git repo in ~/repo
• Accessible to all group members