This repository was archived by the owner on Jan 26, 2021. It is now read-only.

Support asymmetric Dirichlet prior #22

Open: hiyijian wants to merge 3 commits into microsoft:master from hiyijian:master


@hiyijian hiyijian commented Jan 19, 2016

Contributor

According to Wallach’s paper, combining an asymmetric, hierarchical Dirichlet prior over the document–topic distributions with a symmetric Dirichlet prior over the topic–word distributions results in a significantly better model.
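
For reference, in the standard collapsed Gibbs sampler for LDA the asymmetric prior enters only through a per-topic $\alpha_k$ in the document–topic factor. The notation below is the usual textbook one and is an assumption for illustration, not something taken from this PR's code:

```latex
p(z_{di} = k \mid \mathbf{z}_{-di}, \mathbf{w}) \;\propto\;
  \left( n_{dk}^{-di} + \alpha_k \right)
  \frac{ n_{k w_{di}}^{-di} + \beta }{ n_{k}^{-di} + V \beta }
```

Here $n_{dk}$ counts tokens in document $d$ assigned to topic $k$, $n_{kw}$ counts assignments of word $w$ to topic $k$, $n_k = \sum_w n_{kw}$, $V$ is the vocabulary size, and the superscript $-di$ excludes the token being resampled. With a symmetric prior every $\alpha_k$ is the same constant.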
This PR supports asymmetric alpha through the following steps:

  1. Add two extra tables to Multiverso. One is the topic frequency table, a matrix that counts each topic's frequency. The other is the doc length table, a single row that counts how many documents have length k.
  2. Initialize the two extra tables from the randomly initialized documents.
  3. Learn the alpha distribution from the two extra tables every 5 iterations.
  4. Build an alias table for the learned alpha distribution.
  5. Sample topics with the learned alpha distribution and the alias table. Meanwhile, update the counts in the topic frequency table if necessary.
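
Steps 4 and 5 can be pictured with a small stand-alone sketch. This is not the PR's code: `AliasTable` and its members are hypothetical names, and the construction shown is Vose's alias method, assumed here as one common way to build such a table over the learned alpha values. It assumes a non-empty weight vector with a positive sum.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the PR): Vose's alias method over a vector
// of non-negative weights, e.g. the learned per-topic alpha values.
// Assumes `weights` is non-empty and sums to a positive value.
struct AliasTable {
    std::vector<double> prob;        // probability of keeping bucket k's own topic
    std::vector<std::size_t> alias;  // fallback topic for bucket k

    explicit AliasTable(const std::vector<double>& weights)
        : prob(weights.size(), 1.0), alias(weights.size(), 0) {
        const std::size_t n = weights.size();
        double sum = 0.0;
        for (double w : weights) sum += w;

        // Rescale so that the average bucket mass is exactly 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t k = 0; k < n; ++k) {
            alias[k] = k;  // default: a bucket falls back to itself
            scaled[k] = weights[k] * static_cast<double>(n) / sum;
            (scaled[k] < 1.0 ? small : large).push_back(k);
        }

        // Top up each under-full bucket with mass from an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1.0 - scaled[s];
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }
        // Buckets left over keep prob == 1.0 from the initializer.
    }

    // Draws one topic index in O(1): pick a bucket, then flip a biased coin.
    template <class Rng>
    std::size_t Sample(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> bucket(0, prob.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        const std::size_t k = bucket(rng);
        return coin(rng) < prob[k] ? k : alias[k];
    }
};
```

Building the table costs O(K) for K topics, which is why rebuilding it only when alpha is re-learned, rather than on every draw, keeps sampling cheap.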

To use this new feature, run with the extra option "-num_alpha_iterations".

Please note that there are two TODOs: evaluation in asymmetric prior mode, and inference with an asymmetric prior.

Comment thread src/trainer.cpp Outdated

Contributor


Do we only need to request the table when iter % num_alpha_iteration == 0?
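
A minimal sketch of what this suggestion might look like, assuming the expression in the comment is the intended gate. `ShouldLearnAlpha`, `RunIterations`, and the `request_tables` callback are hypothetical names for illustration; the actual trainer code and the Multiverso calls are not shown on this page.

```cpp
#include <functional>

// Hypothetical predicate: true only on iterations where alpha is re-learned.
// A non-positive interval is treated here as "asymmetric prior disabled".
inline bool ShouldLearnAlpha(int iter, int num_alpha_iteration) {
    return num_alpha_iteration > 0 && iter % num_alpha_iteration == 0;
}

// Hypothetical loop skeleton: the extra tables are requested only when the
// predicate holds, instead of on every iteration. Returns the request count.
inline int RunIterations(int num_iterations, int num_alpha_iteration,
                         const std::function<void(int)>& request_tables) {
    int requests = 0;
    for (int iter = 0; iter < num_iterations; ++iter) {
        if (ShouldLearnAlpha(iter, num_alpha_iteration)) {
            request_tables(iter);  // e.g. topic frequency and doc length tables
            ++requests;
        }
        // ... topic sampling for this iteration would go here ...
    }
    return requests;
}
```

With an interval of 5 and 10 iterations, the callback fires on iterations 0 and 5 only, so the extra table traffic drops to a fifth of the unconditional version.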

feiga commented Jan 25, 2016

Contributor

Thanks for the great work, @hiyijian! I'm really sorry for my late response. The implementation should be OK. I reviewed the code and added some notes. I think it's OK to merge into master.


@hiyijian I'm trying to use asymmetric LDA. Could you tell me your QQ or WeChat? I have some questions.

Contributor Author

@lisendong My implementation is built on top of the source code provided by @feiga, which I think is used within Microsoft only. However, I think my implementation is incorrect. Feel free to contact me via @hiyijian@qq.com


@hiyijian I'm trying to use your asymmetric prior version of LightLDA, which is based on Microsoft's code. I see that you said "my implementation is incorrect". Does that mean your code still has some bugs?


Hi @hiyijian @feiga, why is this not merged yet? Are there issues with the implementation?

Contributor Author

Hi @tangzhenyu and @koustuvsinha. I said "this implementation is incorrect" because I have not seen any improvement compared with feiga's original symmetric LDA. Sometimes it even seems worse :( .
