I have a list of observations where each data point is a pair of a time expression (e.g. night, morning) and an hour on a 12-hr clock (i.e. 1, 2, ..., 12): Y = {<e_i, h_i>} for i = 1, ..., N. I would like to estimate the distribution of hours on a 24-hr clock given a time expression (or, equivalently, classify each data point as AM or PM).

I have a feeling EM would be useful here, given the hidden AM/PM variable, but I'm struggling to define the parameters. In every other example where I've used EM, something was assumed about the distribution that generated the observations (e.g. that it is a normal distribution, or a bag-of-words model for document classification). I'm not sure what the corresponding assumption would be here.

I'd appreciate any help!

asked Oct 22, 2021 at 1:07

1 Answer

I ended up solving it as an integer linear programming (ILP) problem:

I defined a binary variable for each combination of 12-hr hour and time expression (true if it is PM, false if AM), plus a start-time and an end-time variable for each expression. My constraints encoded the order of the time expressions, e.g. morning ends before noon starts, etc. The objective maximized the number of observations that fall between the start and end time of their expression.
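
To make the idea concrete, here is a minimal sketch of the same objective. It is not the code I actually ran: to keep it dependency-free it replaces the ILP solver with an exhaustive search over integer cut points, which is only feasible for a handful of expressions. The expression names, their order, and the toy observations are assumptions made up for illustration. It also assumes that each expression covers one contiguous window that does not wrap around midnight (so "night" is left out), and that an hour whose AM and PM readings both fit is labelled AM.

```python
from itertools import combinations_with_replacement

# Hypothetical expressions, listed in the order their windows must appear.
EXPRESSIONS = ["morning", "afternoon", "evening"]

# Toy observations (expression, hour on a 12-hr clock), invented for this sketch.
OBSERVATIONS = [
    ("morning", 7), ("morning", 8), ("morning", 9), ("morning", 10),
    ("afternoon", 1), ("afternoon", 2), ("afternoon", 3), ("afternoon", 4),
    ("evening", 6), ("evening", 7), ("evening", 8), ("evening", 9),
]


def readings(hour12):
    """Return the (AM, PM) readings of a 12-hr hour on a 24-hr clock."""
    am = hour12 % 12  # 12 AM is hour 0, 12 PM is hour 12
    return am, am + 12


def classify(observations, cuts):
    """Label each observation "AM", "PM", or None (fits neither reading).

    cuts is a non-decreasing tuple of hours; expression k covers the
    half-open window [cuts[k], cuts[k + 1]).
    """
    labels = []
    for expr, hour in observations:
        k = EXPRESSIONS.index(expr)
        start, end = cuts[k], cuts[k + 1]
        am, pm = readings(hour)
        if start <= am < end:
            labels.append("AM")  # arbitrary tie-break: AM wins if both fit
        elif start <= pm < end:
            labels.append("PM")
        else:
            labels.append(None)
    return labels


def solve(observations):
    """Search all ordered cut points and keep the best-scoring set."""
    best_score, best_cuts = -1, None
    n_cuts = len(EXPRESSIONS) + 1
    # Sorted tuples from range(25) enforce the ordering constraint:
    # each window ends exactly where the next one starts.
    for cuts in combinations_with_replacement(range(25), n_cuts):
        labels = classify(observations, cuts)
        score = sum(label is not None for label in labels)
        if score > best_score:
            best_score, best_cuts = score, cuts
    return best_score, best_cuts


if __name__ == "__main__":
    score, cuts = solve(OBSERVATIONS)
    print("observations covered:", score)
    for (expr, hour), label in zip(OBSERVATIONS, classify(OBSERVATIONS, cuts)):
        print(expr, hour, label)
```

Several sets of cut points can reach the same score, so the windows themselves should not be over-interpreted; the AM/PM labels are the part the ordering constraint pins down. On real data with many expressions the search space grows quickly, which is why an ILP solver is the better fit.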

answered Oct 26, 2021 at 16:47