I am writing a small app that will, among other things, expand shortcuts into full text while typing. Example: the user writes "BNN" somewhere and presses the relevant keyboard combination, and the app replaces "BNN" with "Hi I am Banana".
After some research I learned that it can be done using user32.dll, and the process is as follows:
1) get the active window handle
2) get the active window's thread ID
3) attach input to active thread
4) get focused control handle (+caret position but that is not the issue)
5) detach input from active thread
6) get the text from the focused control using its handle
And here is my code so far:
try
{
    // GetWindowThreadProcessId returns a thread ID (a uint), not a handle.
    IntPtr activeWindowHandle = GetForegroundWindow();
    uint activeWindowThread = GetWindowThreadProcessId(activeWindowHandle, IntPtr.Zero);
    uint thisWindowThread = GetWindowThreadProcessId(this.Handle, IntPtr.Zero);

    // Share input state with the foreground thread so GetFocus() can see its focused control.
    AttachThreadInput(activeWindowThread, thisWindowThread, true);
    IntPtr focusedControlHandle = GetFocus();
    AttachThreadInput(activeWindowThread, thisWindowThread, false);

    if (focusedControlHandle != IntPtr.Zero)
    {
        TB_Output.Text += focusedControlHandle + " , " + GetText(focusedControlHandle) + Environment.NewLine;
    }
}
catch (Exception exp)
{
    MessageBox.Show(exp.Message);
}
//...
//...
// P/Invoke declarations (these need: using System.Runtime.InteropServices;)
[DllImport("user32.dll")]
internal static extern IntPtr GetForegroundWindow();
// Passing IntPtr.Zero as lpdwProcessId means the process ID is not requested.
[DllImport("user32.dll", SetLastError = true)]
internal static extern uint GetWindowThreadProcessId(IntPtr hWnd, IntPtr lpdwProcessId);
[DllImport("user32.dll", SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
internal static extern bool AttachThreadInput(uint idAttach, uint idAttachTo, bool fAttach);
[DllImport("user32.dll")]
internal static extern IntPtr GetFocus();
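The GetText helper called in the code above is not shown in the question. A minimal sketch of what it might look like, assuming it reads the control's text with the standard WM_GETTEXTLENGTH and WM_GETTEXT window messages (the class name NativeText and the exact shape of the helper are assumptions, not the original code):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

internal static class NativeText
{
    // Standard window message IDs from winuser.h.
    private const uint WM_GETTEXT = 0x000D;
    private const uint WM_GETTEXTLENGTH = 0x000E;

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, StringBuilder lParam);

    // Hypothetical stand-in for the question's GetText(handle) helper.
    internal static string GetText(IntPtr hWnd)
    {
        // Ask the window how many characters its text has (terminator excluded).
        int length = (int)SendMessage(hWnd, WM_GETTEXTLENGTH, IntPtr.Zero, IntPtr.Zero).ToInt64();
        if (length <= 0)
        {
            return string.Empty;
        }

        // Reserve room for the text plus the terminating null character.
        var buffer = new StringBuilder(length + 1);
        SendMessage(hWnd, WM_GETTEXT, new IntPtr(buffer.Capacity), buffer);
        return buffer.ToString();
    }
}
```

For a top-level window, WM_GETTEXT returns the caption, which would be consistent with getting only the window or tab title when the focused handle is the browser's main window rather than an edit control.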
This works perfectly for some Windows Forms apps, but it doesn't work with WPF apps or browsers; it just gives me the title of the WPF app or the title of the tab in Chrome.
If I run the app on this page while typing this question, for instance, then instead of the content of the question the text I get is:
Get text from inside google chrome using my c# app - Stack Overflow - Google
Probably because they use graphics to render the elements, and I'm not sure how I can get to the active element and read its text.
I only referred to web browsers in the question's title because this tool will mostly be used with web browsers.
Thank you in advance for any feedback.
-
Not sure if it is the best approach; I would go with developer.chrome.com/extensions/devguide. It is doable IMHO, but hooking into the web browser could trigger AV software like hell. – Cleptus, Apr 24, 2018 at 13:47
-
@bradbury9 I considered making an extension, but it causes too many problems, the main one being that this tool will be used mostly with Chrome but not only, so I can't restrict it to a Chrome extension, or any other browser extension actually. Plus, it's easier to maintain and update as an app if I install it across my whole company... – Banana, Apr 24, 2018 at 13:50
-
@bradbury9 Arranging an exception in our overly protective antivirus is not a problem. – Banana, Apr 24, 2018 at 13:51
-
If you want to do that in web browsers and WPF apps, you will have to create a keylogger that constantly monitors the keyboard and replaces the text by simulating keyboard input. WPF controls have no Windows handles, so WinAPI is useless for them. The same goes for controls rendered in web browsers. – dymanoid, May 29, 2018 at 16:49
-
@dymanoid Thanks for the input. Technically my app already is a keylogger, as it monitors for the combination of keys that triggers the expansion. Unfortunately, I am aware that browsers and WPF window controls have no handles (since they are technically graphical objects), but maybe there is a creative way of achieving this? Spell checkers do manage to do it somehow, so why can't we? – Banana, May 29, 2018 at 16:54
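The approach suggested in the comments above (replace the text by simulating keyboard input instead of reading the control) could be sketched roughly as follows. This is only an illustration under assumptions: it presumes the shortcut was typed immediately before the caret, it uses System.Windows.Forms.SendKeys for the simulated input, and the ShortcutExpander class and its method names are hypothetical.

```csharp
using System;
using System.Text;
using System.Windows.Forms;

internal static class ShortcutExpander
{
    // Characters with special meaning in SendKeys; each must be wrapped in braces to be sent literally.
    private const string SpecialChars = "+^%~(){}[]";

    // Escapes literal text so that SendKeys types it as-is.
    internal static string EscapeForSendKeys(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (SpecialChars.IndexOf(c) >= 0)
            {
                sb.Append('{').Append(c).Append('}');
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Builds the key sequence: one backspace per shortcut character, then the expansion.
    internal static string BuildKeys(string shortcut, string expansion)
    {
        string erase = shortcut.Length > 0 ? "{BACKSPACE " + shortcut.Length + "}" : string.Empty;
        return erase + EscapeForSendKeys(expansion);
    }

    // Sends the sequence to whichever window currently has keyboard focus.
    internal static void Expand(string shortcut, string expansion)
    {
        SendKeys.SendWait(BuildKeys(shortcut, expansion));
    }
}
```

Calling ShortcutExpander.Expand("BNN", "Hi I am Banana") from the hotkey handler would send three backspaces followed by the full text. Because it only types into the focused window, it does not depend on the target control having a window handle.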
2 Answers
I would personally try using a library that works well with Chrome. Many are available, such as Kantu, which is specialized for Chrome.
Examples: TestCafe, Watir, SlimerJS
Comments
I think that library is not the optimal way to do what you want. I would use a library more suited to browser DOM manipulation, like Selenium.