Comments on "Random Observations: Design of a reporting system"

Anonymous (2010-01-09):

This comment has been removed by a blog administrator.

Yves Hiernaux (2009-12-23 01:48):

In general, and hopefully, the server has passwords encrypted.

And it would be insecure, and rather inconvenient, to ask users to write their credentials in an IQY file.

Excel and IE are two separate worlds; I don't think you can share cookies between them.

But there are other browsers, and OpenOffice too.

Ben Tilly (2009-12-23 00:26):

That is annoying. It is just like Microsoft to try to use their control of one technology (Excel) to push others (ASP.net).

One possible end run is to try creating an IQY file and adding the name and password in POST parameters. See <a href="http://articles.techrepublic.com.com/5100-10878_11-6115870.html?part=rss&tag=feed&subj=tr" rel="nofollow">this article</a> for information on how to set that up. That moves the authentication information into the spreadsheet. If you make the URL https, then the password is transmitted in encrypted form, and you are as secure as your Excel worksheet is.

Another thing that I'm wondering is whether you can solve the problem using cookies.
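If I have the format right, such an IQY file is just four lines: the literal word WEB, a version number, the URL, and the POST body. Something like this (the URL and field names here are made up; the real ones depend on what the server's login handler expects):

```
WEB
1
https://reports.example.com/report.asp
UserName=jsmith&Password=s3cret
```

With no fourth line the query is a plain GET; adding the fourth line makes Excel send those parameters as a POST when it refreshes the query.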
Using session cookies might not work because it probably launches a new IE instance for each web request. But what happens if you set short-lived persistent cookies (say, 2 hours)? Those should be shared across different launches of IE, and may resolve your problem nicely. (You may need to do some extra work on the server side to use persistent cookies for authentication rather than session cookies.)

Yves Hiernaux (2009-12-22 23:24):

As long as you are on MSFT platforms and use NTLM, all is fine.

You browse an intranet page with a few reports. You pick one, and it opens Excel.

You are automatically authenticated, the ASP/ASPX page receives the user, can apply a user profile, and the backend returns authorised data.

Now if you are on the internet, or on an intranet without MSFT servers, the only difference is that when Excel opens, you get a popup to authenticate.

Users quickly get annoyed, as this popup comes up for each report you open.
Not to mention the fight you will get with the SSO guys.

We used this technique for getting SAP reports for some big corporations, but dropped it as they were all moving to JAVA servers.

There is some hackery, for intranets, to get NTLM on JAVA, but IT validation teams didn't like it in general.

OpenOffice does not even have the luxury of NTLM, so it is always with a popup.

Now if there is a solution for that, I'm a big fan of web query for reports.

We considered it for our new project <a href="http://beebole.com" rel="nofollow">BeeBole</a> but had to drop it for the same reasons.

Ben Tilly (2009-12-22 22:20):

@Yves: I had the luxury of not having to worry about authentication because the application was only on the internal network, and the execs were OK with not having any security on it.

But if I did need security, I would have tried adding authentication on the web page (by configuring HTTP digest authentication), and then seeing whether Excel handled it. I suspect it would have.

Once you have authentication, you can set up authorization in any way you want. That is a much trickier issue because the fundamental goal is to let the organization restrict access to data in the way that it wants. And there is no way to guess what kinds of rules they will want without knowing the organization.

Yves Hiernaux (2009-12-22 20:47):

Hi Ben,
I am a firm believer in your model, and the recent moves of Microsoft and Google Apps in online sheet sharing are just another good step.

I would really be interested in knowing how you created a secure way to link Excel sheets and web apps.

Was it an API? Do the users have to enter a login/password each time in the sheet?

How do you control authorization and access to data based on role, or something else?

It would be really great to know a bit more.

Anonymous (2009-12-15 07:51):

@btilly: I think you're right that no reporting package can go into a random company's data and automatically make sense of whatever they've got. I do, however, think that if you focus on a specific industry that tends to store data in certain ways and tends to want a finite set of metrics, then there are very large efficiencies to be had through a third party focusing on nothing but analytics and reporting for these companies. I used to do this manually at my old job, and now build software that automates a large portion of the process.

Also, I disagree that this presents a big risk around vendor lock-in. If you were to outsource your database analytics to a third party, you can fire them at any time, and you still have 100% of your data with zero effort, because your database is still in house. You can't say the same for the vast majority of the outsourced, software-as-a-service categories that are popular. I'm thinking of web traffic analytics, CRM, project management, etc. With those tools, you're dependent on third-party vendors for functionality <b>and</b> the underlying data.
In many cases you can get at the underlying data through an API, but it's not as easy as if the data remained in house in your own db, and you have to work to get it into a format that is useful to you. Then again, you may prefer to use internally built software for some or all of those categories to avoid vendor lock-in.

In any case, thanks for your response.

Ben Tilly (2009-12-14 11:29):

@Aristotle: It does all of that? I'm surprised. Other ones that I have looked at don't. They have particularly had trouble with subtle details about conditions in ON clauses (including the need to rearrange ON clauses in some cases).

How does Fey do with temporary tables created (and sometimes dropped then recreated) on the fly? Looking at Fey, it looks like it wants to know your schema, but there is no permanent schema there.

I ask because I have found that temporary tables have a number of advantages that I would not want to lose. I still mean to get around to a post on them in a few days, but just for a simple example, I have run across situations where the database didn't find a good execution plan for a multi-table join. With temporary tables you can force the execution plan by joining some things into a temporary table, adding indexes, then doing the main join.

This has sped up slow reports often enough that I wouldn't want to lose that tool from my toolbox.

That aside, even if Fey can do everything that SQL can, I still don't see the value add. You say that it gives me an OO syntax. But I don't care about OO syntax when looking at reports that basically are a series of SQL statements that display the result of the last one. You say that it lets you rearrange clauses. That's a pretty minor win from my point of view.
I'd need something a lot bigger to add the dependency.

Anonymous (2009-12-14 11:00):
<i>- Performing left joins
- Distinguishing between whether a condition goes in the WHERE or the ON clause of a left join (they are different)
- Handling joining to the same table multiple times.
- Handling self joins.
- Distinguishing between the following two:</i>

Fey passes all of these tests. There are still some things it has trouble with, but you have to get more creative than these. Plus, Dave’s goal is to map out pretty well everything that can be done with plain SQL; if Fey::SQL cannot yet express it, most likely it will be added (particularly if you proffer a patch).

Mind, this isn’t too surprising, as it doesn’t try to <em>simplify</em> SQL writing for you in any way, anyway. It just gives you an OO syntax that corresponds 1:1 to SQL.

The point of that is that it lets you write queries using symbols instead of strings, i.e. it replaces table and column names with Fey::Table and Fey::Column objects along with corresponding ::Alias classes. It also lets you combine clauses in arbitrary order instead of the fixed order in which SQL demands them.

The upshot is that you can extract generic aspects from queries and write them in a reusable way.

F.ex. I used this to write a generic tagging (as in “folksonomies”) query aspect: I pass in a query object, a record table object, a tag table object, and a bunch of tags, and the function takes each tag, generates an anonymous tag table alias for it, joins that against the record table, and puts a <b>WHERE</b> condition on the <b>tagname</b> column <em>for that alias</em>. Now I don’t have to write query templates carefully laid out to make it possible to add tagging to them, nor go back and rearrange them if I didn’t write them that way to begin with.
I can pass any query plus specific tables to this function, and it will weave the requisite bits into the query to restrict it by some tags.

It’s nice.

Building queries using templates, as I did before, was much more complicated and hairy.

Ben Tilly (2009-12-13 21:18):

@Aristotle: I confess that I've never seen the point of using SQL generators for complex SQL.

ORMs are one thing - if someone is trying to develop something in a standard programming language, then it makes sense to provide an abstraction layer to make the database details invisible within application code.

However, complex queries are a different story. SQL is a reasonably standard and widely understood language that is more compact and expressive for what it does than any query builder that I have seen. What then is the point of learning another layer so that you can program in a less expressive way? Particularly when you wind up with arbitrary restrictions when you are trying to do complex things with the database. Fey claims to be better at that issue than the alternatives. Maybe so. But I'd be surprised if there weren't things it misses that SQL has.
Places I have seen issues in the past include:
- Performing left joins
- Distinguishing between whether a condition goes in the WHERE or the ON clause of a left join (they are different)
- Handling joining to the same table multiple times.
- Handling self joins.
- Distinguishing between the following two:

    SELECT ...
    FROM foo
    LEFT JOIN bar
      ON foo.bar_id = bar.id
    LEFT JOIN baz
      ON bar.baz_id = baz.id
      AND baz.some_field <> 'cond'

and

    SELECT ...
    FROM foo
    LEFT JOIN (
      bar
      LEFT JOIN baz
        ON bar.baz_id = baz.id
    )
      ON foo.bar_id = bar.id
      AND baz.some_field <> 'cond'

(The parentheses are actually not needed in the last case, but stacking ON clauses without them gets confusing.)

If you're not having to work closely with SQL, then these may seem like esoteric points. However, when you're doing reporting in a serious way, you encounter these and more fairly regularly.

Moving on, I actually wasn't using ginormous SQL queries. Instead I was breaking them up into a series of data steps using temp tables. The benefits of this approach are sufficiently big and non-obvious that I think I need another blog entry on it when I get time.
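Just to gesture at the shape of it, here is a toy version of one such data step. The tables and numbers are made up, and I'm using SQLite purely for illustration; the same pattern works on any database with temp tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Made-up source tables standing in for a real reporting schema.
cur.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, amount REAL)")
cur.execute("CREATE TABLE users (id INTEGER, region TEXT)")
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0)])
cur.executemany("INSERT INTO users VALUES (?, ?)",
                [(1, "east"), (2, "west")])

# Data step 1: aggregate into a temp table rather than one giant query.
cur.execute("""
    CREATE TEMP TABLE order_totals AS
    SELECT user_id, SUM(amount) AS total
    FROM orders
    GROUP BY user_id
""")
# An index on the temp table gives you a handle on the main join's plan.
cur.execute("CREATE INDEX temp.idx_order_totals ON order_totals (user_id)")

# Data step 2: the main join is now small and readable.
cur.execute("""
    SELECT u.region, ot.total
    FROM order_totals ot
    JOIN users u ON u.id = ot.user_id
    ORDER BY u.region
""")
rows = cur.fetchall()
print(rows)  # [('east', 30.0), ('west', 5.0)]
```

Each step is a short, self-explanatory statement, which is exactly what makes the final report legible to end users.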
But as for just one: it is easy to get well-formatted SQL for complex reports that is broken into digestible chunks which end users can follow.

Ben Tilly (2009-12-13 21:07):

@jakestein: When I started in that job, the first thing that I looked at was the viability of third-party reporting solutions.

However, it quickly became clear that one of the biggest problems that organization had was a sprawling collection of different databases with inconsistent (and generally not well thought out) schemas for its information. For example, I had to look in different databases for user information, data on promotions that had been run, tracking of order fulfillment, and financial data. That made it difficult to produce reports that combined all of those different types of data. As you moved from table to table, fields would frequently change names and data types. And of course the only place this was documented was in the brains of the people who developed and QAed it.

Users already had direct access to that data. But nobody had the domain knowledge to make heads or tails of it. In a few weeks of searching, I didn't find any third-party solutions that offered a way to reverse engineer any kind of sanity on top of a sprawling data mess like that. I knew that I could grow a system that did.

As for the idea of exporting the whole problem to a third party, that strikes me as a bad business decision. Understanding of data formats and structure is institutional knowledge. A third-party vendor has real disadvantages and no real advantages over the company in learning that knowledge. Some of that data is highly sensitive.
And once a company is dependent on a third party for understanding its internal data, you've got a level of vendor lock-in that is going to cost you dearly down the road. (Read <a href="http://www.inforules.com/" rel="nofollow">Information Rules</a> for common business strategy around third-party vendors and vendor lock-in.)

Anonymous (2009-12-13 07:31):

Ben,
I definitely agree with you on the importance of integrating with Excel and specifying relative dates when building a reporting system. Did you think about trying out an externally developed reporting package rather than building internally? IMO, it's better to give the consumers of reports the ability to go through the entire process of creating, consuming, and modifying reports without incremental work from you / the tech team. Of course, there will be times when users get stuck, but then support is the vendor's problem, not yours. Plus, you're not responsible for bug fixes, new features, etc.

Disclosure: I'm <a href="http://www.rjmetrics.com" rel="nofollow">more than a little biased</a>. This is something that I have spent a lot of time on, and I would be curious to hear your thoughts.

Anonymous (2009-12-13 02:47):

I use Dave Rolsky’s <a href="http://p3rl.org/Fey" rel="nofollow">Fey</a> to piece together queries, and wouldn’t want to give it up for anything.

Of course, the queries aren’t formatted any particular way, and the <b>SELECT</b> clause generally grows unwieldy – so the resulting queries couldn’t be shown to end users, at least not easily. But I would sooner be willing to spend effort to solve this issue in some way (<a href="http://p3rl.org/SQL::Beautify" rel="nofollow">SQL::Beautify</a>?) than I would be to go back to treating SQL as strings.
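For readers who have not seen the symbols-instead-of-strings style described above: the following is not Fey (which is Perl, and richer), but a minimal Python toy that captures the two ideas under discussion - clauses can be added in any order and are only rendered to SQL at the end, so a generic aspect like tag restriction can be woven into any query. All table, column, and function names here are hypothetical:

```python
class Query:
    """Toy query builder: clauses collect in any order; SQL is rendered last.
    Illustrative only -- this is not the Fey API."""
    def __init__(self):
        self._select, self._from, self._joins, self._where = [], None, [], []

    def select(self, *cols):
        self._select.extend(cols)
        return self

    def from_(self, table):
        self._from = table
        return self

    def join(self, table, on):
        self._joins.append((table, on))
        return self

    def where(self, cond):
        self._where.append(cond)
        return self

    def sql(self):
        parts = ["SELECT " + ", ".join(self._select), "FROM " + self._from]
        for table, on in self._joins:
            parts.append(f"JOIN {table} ON {on}")
        if self._where:
            parts.append("WHERE " + " AND ".join(self._where))
        return "\n".join(parts)

def restrict_by_tags(q, record_table, tags):
    # Generic "tagging" aspect: one aliased join plus one WHERE per tag,
    # woven into any query after the fact -- the reuse described above.
    for i, tag in enumerate(tags):
        alias = f"tag{i}"
        q.join(f"tags AS {alias}", f"{alias}.record_id = {record_table}.id")
        q.where(f"{alias}.tagname = '{tag}'")
    return q

q = Query().from_("posts").select("posts.id", "posts.title")
sql_text = restrict_by_tags(q, "posts", ["perl", "sql"]).sql()
print(sql_text)
```

Because nothing is stringified until `sql()` is called, `restrict_by_tags` never needs to know how the original query was laid out, which is the property that makes the aspect reusable.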